Empirical studies of markets in disequilibrium have relied on the
appropriateness  of  explicit  price  adjustment  equations,  serial 
independence,  normally  distributed  errors,  and  explicit  equations 
relating  the  observed  quantity  transacted  to  desired  supply  and  demand. 
For  example,  the  asymptotic  properties  of  "disequilibrium"  estimators 
and  test  statistics  are  sensitive  to  the  parametric  forms  chosen  for 
price  adjustment,  the  serial  behavior  of  the  observations,   error 
distributions,  and  the  quantity  transacted.   In  a  word,  "disequilibrium" 
estimators  and  statistics  are  non-robust.   Unfortunately,  economic 
theory  provides  little  basis  for  choosing  the  parametric  forms.  A  lack 
of  economic-theoretic  restrictions  coupled  with  non-robust  estimators 
and  statistics  has  severely  limited  empirical  studies  of  markets  in 
disequilibrium.

This dissertation develops new methods for more meaningful
estimation  of  disequilibrium  models.   The  new  methods  involve  more 
general  models  and  robust  estimators. 

A  switching  regression  model  with  imperfect  sample  separation  is 
used  to  incorporate  price  adjustment  into  a  disequilibrium  model.   The 
model enables price adjustment to be incorporated with less a priori
information  than  usual.   To  estimate  the  model,  maximum  likelihood  and 
least  squares  estimators  are  proposed. 

The  asymptotic  properties  of  the  maximum  likelihood  estimator  are 
examined.   Previous  results  for  maximum  likelihood  estimators  of 
disequilibrium  models  are  generalized  with  asymptotic  theory  for 
serially  dependent  observations.   The  maximum  likelihood  estimator  is 
shown  to  be  consistent  and  asymptotically  normal  even  if  the  data  are 
characterized  by  unknown  forms  of  serial  dependence.  Asymptotic  test 
statistics  are  also  derived. 

The  methodology  is  illustrated  with  an  empirical  application  to  the 
U.S.  commercial  loan  market  from  1979  to  1984. 

Finally,  I  propose  semiparametric  models  and  estimators  for  markets 
in  disequilibrium.   These  methods  are  applicable  when  the  error 
distributions  are  unknown,  and  the  quantity  transacted  is  an  unknown 
function  of  supply  and  demand.   Consistent  estimators  are  derived  using 
the method of maximum score.

CHAPTER 1
AN  OVERVIEW 


1.1  The  Problem 

Before  Fair  and  Jaffee  (1972)  introduced  their  econometric 
disequilibrium  model,  estimation  of  market  behavior  was  confined  to  the 
equilibrium  assumption.   The  study  of  the  econometrics  of  disequilibrium 
was  further  developed  by  Fair  and  Kelejian  (1974),  Maddala  and  Nelson 
(1974),  Amemiya  (1974),  Goldfeld  and  Quandt  (1975),  Laffont  and  Garcia 
(1977), Bowden (1978), and others. By allowing for disequilibrium,
Fair and Jaffee's work and the subsequent work it inspired represent an
important generalization, but a generalization obtained by imposing

(1)  explicit  price  adjustment  equations, 

(2)  serial  independence  on  time  series  data, 

(3)  normally  distributed  error  terms,  and 

(4)  explicit  equations  relating  the  observed  quantity  transacted 
to  desired  supply  and  demand. 

This  contrasts  with  the  equilibrium  assumption  where 

(1)  price  adjustment  is  not  an  estimation  issue, 

(2)  allowances  are  made  for  serial  correlation, 

(3)  the  errors  are  only  assumed  to  be  uncorrelated  with  a  subset  of  the 
explanatory  variables,  and 

(4)  desired  supply  and  demand  are  directly  observable. 


The  estimation  of  market  behavior  has  been  extended  to  the 
disequilibrium  assumption,  but  at  a  cost. 

Economic theory for markets in disequilibrium is a relatively new
area of research that has been developed in the last few years by
Benassy (1982) and Fisher (1983), among others. Being recent,
however, the theories that have been proposed are rather
limited in scope and tentative. For the empirical researcher, the
theories  provide  little  guidance  for  specifying  price  adjustment,  and 
the  quantity  transacted  as  a  function  of  desired  supply  and  demand;  they 
provide  no  basis  for  specifying  the  error  distributions  and  serial 
independence.  A  survey  of  the  many  empirical  studies  that  have  followed 
Fair  and  Jaffee  (1972)  suggests  that  the  basis  for  specifying  these 
aspects  of  the  econometric  disequilibrium  model  has  been  largely 
computational  tractability.   This  approach  has  led  to  several 
drastically  different  disequilibrium  specifications.   The  assumptions 
of  each  specification  generally  do  not  represent  well-defined 
economic-theoretic  restrictions,  and  thus  differences  among  them  seldom 
reflect  differences  among  well-defined  alternative  economic  theories. 
As  a  result,  most  disequilibrium  specifications  are  as  good  (or  bad)  as 
any  other.   Unfortunately,  each  specification  produces  estimates  only  as 
reliable  as  the  assumptions  imposed,  and  differences  among  them  can  lead 
to  conflicting  estimates  of  supply  and  demand  equations. 

The  lack  of  economic-theoretic  restrictions  alone  does  not  prohibit 
meaningful  estimation  of  a  disequilibrium  model.   The  estimators 
commonly  applied  are  also  prohibitive.   Most  proposed  "disequilibrium" 
estimators can be viewed as corrected versions of the "equilibrium"
least squares (LS) and maximum likelihood (ML) estimators (see note 2). The
inequality of supply and demand introduces nonzero correlation between
the explanatory variables and the error terms. Given a model for the
inequality of supply and demand, the "equilibrium" LS and ML estimators
can be corrected for the nonzero correlation to yield consistent
"disequilibrium" LS and ML estimators. The correction approach provides
insight into the problem of relaxing the equilibrium assumption, but
generally requires restrictive assumptions to make it operational. In
particular, consistent LS and ML estimation of a disequilibrium model
depends on choosing the correct parametric forms for price
adjustment, the error distribution functions, and the quantity
transacted; useful inferences require allowances for serial correlation
as well as correct parametric forms. Non-robustness coupled with a lack
of economic-theoretic restrictions severely limits the reliability of LS
and ML estimation.

To  illustrate  these  points  we  consider  the  following  model. 

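$D_t = \beta_1^{0\prime} x_t + \varepsilon_{1t}$  (1.1)

$S_t = \beta_2^{0\prime} x_t + \varepsilon_{2t}$  (1.2)

Data: $(Q_t, x_t)_{t=1}^{n}$.

Equilibrium assumption: $Q_t = D_t = S_t$.

Disequilibrium assumption: $D_t \neq S_t$,

$Q_t = \tau_t(D_t, S_t)$,

$\Delta p_{t+1} = \Pi_t(D_t - S_t)$.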

Equations (1.1) and (1.2) are demand and supply functions; $D_t$ denotes the
quantity demanded, $S_t$ the quantity supplied, and $x_t$ a vector of explanatory
variables; $\varepsilon_{1t}$ and $\varepsilon_{2t}$ denote random error terms. Under the equilibrium
assumption the observed quantity transacted, $Q_t$, is equal to both $D_t$ and
$S_t$; data are observed after prices adjust, and therefore adjustment
models are irrelevant. Under the disequilibrium assumption $D_t$ and $S_t$ are
not necessarily observable; the function $\tau_t(\cdot,\cdot)$ specifies the position
of $D_t$ and $S_t$ relative to the observable $Q_t$; data reflect adjustments at
various stages, and therefore it becomes meaningful to model price
adjustment. Price adjustments are modeled as follows: the price change,
$\Delta p_{t+1} = p_{t+1} - p_t$, depends on excess demand, $D_t - S_t$, through the function
$\Pi_t$.

When LS and ML are applied under the disequilibrium assumption, it
becomes necessary to specify the distribution of $(\varepsilon_{1t}, \varepsilon_{2t})$ up to an
unknown parameter vector, and the functional forms for $\tau_t$ and $\Pi_t$. The
following example will illustrate this. Consider the problem of
obtaining a consistent LS estimate of $\beta_1^0$. Under the equilibrium
assumption the data are conditional on the event $Q_t = D_t = S_t$, and therefore a
consistent "equilibrium" LS estimate requires $E(x_t \varepsilon_{1t} \mid Q_t = D_t = S_t) = 0$, or
equivalently $E(x_t \varepsilon_{1t}) = 0$ since $Q_t = D_t = S_t$ is a sure event by assumption.
Under the disequilibrium assumption, by contrast, each observation is
conditional on either $D_t < S_t$ or $D_t > S_t$, and therefore a consistent
"disequilibrium" LS estimate requires $E(x_t \varepsilon_{1t} \mid D_t < S_t) = E(x_t \varepsilon_{1t} \mid D_t > S_t) = 0$.
But since, for example, $D_t < S_t$ is generally not a sure event, the
condition $E(x_t \varepsilon_{1t} \mid D_t < S_t) = 0$ is not equivalent to $E(x_t \varepsilon_{1t}) = 0$. On the
contrary, $E(x_t \varepsilon_{1t} \mid D_t < S_t)$ will generally be nonzero even if $E(x_t \varepsilon_{1t}) = 0$
holds. As a result, the LS estimator must be corrected for the nonzero
conditional correlation between $x_t$ and $\varepsilon_{1t}$; that is, parametric models
must be specified for $E(x_t \varepsilon_{1t} \mid D_t < S_t)$ and $E(x_t \varepsilon_{1t} \mid D_t > S_t)$. For example,
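suppose we specify

$(\varepsilon_{1t}, \varepsilon_{2t})' \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{\varepsilon 1}^2 & 0 \\ 0 & \sigma_{\varepsilon 2}^2 \end{pmatrix} \right)$,  (1.3)

$Q_t = \tau_t(D_t, S_t) \equiv \min(D_t, S_t)$,  (1.4)

$\Delta p_{t+1} = \Pi_t(D_t, S_t) \equiv \alpha (D_t - S_t)$, $\alpha > 0$,  (1.5)

and for the first $n_1$ observations we have $\Delta p_{t+1} < 0$. Then $Q_t = D_t < S_t$ for
$t = 1, \ldots, n_1$ by equations (1.4) and (1.5), and consistent LS estimates can
be obtained by solving the problem

$\min_{(\beta_1, \sigma_{\varepsilon 1}^2, \beta_2, \sigma_{\varepsilon 2}^2)} \sum_{t=1}^{n_1} \left( Q_t - \beta_1' x_t - E(\varepsilon_{1t} \mid Q_t < S_t; \beta_1, \sigma_{\varepsilon 1}^2, \beta_2, \sigma_{\varepsilon 2}^2) \right)^2$.

The functional form for the "correction" term $E(\varepsilon_{1t} \mid Q_t < S_t; \cdot)$ follows
from (1.3) (see note 3) and is given by

$-\dfrac{\sigma_{\varepsilon 1}^2}{(\sigma_{\varepsilon 1}^2 + \sigma_{\varepsilon 2}^2)^{1/2}} \cdot \dfrac{\phi\left( (\beta_2 - \beta_1)' x_t / (\sigma_{\varepsilon 1}^2 + \sigma_{\varepsilon 2}^2)^{1/2} \right)}{\Phi\left( (\beta_2 - \beta_1)' x_t / (\sigma_{\varepsilon 1}^2 + \sigma_{\varepsilon 2}^2)^{1/2} \right)}$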

where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the standard normal density and distribution
function. Without a priori restrictions the specification of
assumptions (1.3), (1.4), and (1.5) is arbitrary, but obviously crucial
to the LS estimation of the parameters. For example, given what is
known about most markets, some alternative assumptions that are just
as plausible as (1.3), (1.4), and (1.5) are (1) any nonnormal symmetrical
distribution for the error terms, (2) $Q_t \leq \min(D_t, S_t)$, and (3)
$\Delta p_{t+1} = \alpha (D_t - S_t) + \varepsilon_{3t}$, where $\varepsilon_{3t}$ is a random error term. Alternatively,
one could derive the likelihood function of $(Q_t, x_t)_{t=1}^{n}$ and obtain the
ML estimates. Once again, however, the estimates are subject to the
validity of restrictive assumptions.
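To make the dependence on these assumptions concrete, the following is a minimal sketch of the truncated-normal correction term, assuming independent zero-mean normal errors and Q = min(D, S) as in (1.3) and (1.4). The function names and the scalar argument `gap` (standing for (beta_2 - beta_1)'x_t) are illustrative choices, not notation from the text; the simulation is only a sanity check of the closed form.

```python
import math
import random


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def correction_term(gap, var1, var2):
    """E(eps1 | D < S) for independent eps1 ~ N(0, var1), eps2 ~ N(0, var2).

    gap is (beta2 - beta1)'x, so D < S is the event eps1 - eps2 < gap.
    """
    s = math.sqrt(var1 + var2)
    z = gap / s
    return -(var1 / s) * std_normal_pdf(z) / std_normal_cdf(z)


def simulated_conditional_mean(gap, var1, var2, draws=200_000, seed=0):
    """Monte Carlo estimate of E(eps1 | eps1 - eps2 < gap)."""
    rng = random.Random(seed)
    sd1, sd2 = math.sqrt(var1), math.sqrt(var2)
    total, count = 0.0, 0
    for _ in range(draws):
        e1 = rng.gauss(0.0, sd1)
        e2 = rng.gauss(0.0, sd2)
        if e1 - e2 < gap:  # observation falls in the demand regime
            total += e1
            count += 1
    return total / count
```

If the errors were instead drawn from a nonnormal symmetric distribution, the closed form would no longer match the simulated conditional mean, which is the sensitivity the paragraph above describes.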

Empirical  studies  of  markets  in  disequilibrium  are  concerned  with 
analyzing  time-series  data,  and  therefore  the  possibility  of  serially 
correlated  errors  also  arises.   Most  disequilibrium  studies,  however, 
completely  ignore  the  possibility  of  serial  correlation.   One  reason  for 
this  practice  is  that  maximizing  the  correct  likelihood  function  for  a 
typical disequilibrium model with even a simple form of serial correlation
can be intractable. The problem is one of introducing further
complications into a highly nonlinear structure. (Equilibrium models,
by contrast, have simpler structures, and therefore it is relatively
straightforward to incorporate an ARMA process (say) into these models.)
The problem is further complicated when the true form of the serial
correlation is unknown; even if one were willing to incorporate a simple
process such as AR(1), the result would likely be a questionable
approximation at best. At the same time, failure to adequately account
for serial correlation can cause inconsistent covariance estimates and
incorrectly interpreted test statistics.

In  summary,  the  estimation  of  markets  in  disequilibrium  has  been 
severely  limited  by  the  problems  of  specifying  (1)  price  adjustment; 
(2)  serial  correlation;  (3)  the  distributions  of  the  error  terms  up  to 
an  unknown  parameter  vector;  and  (4) the  quantity  transacted  as  a 
function  of  desired  supply  and  demand. 

1.2  Solutions 

This  thesis  addresses  the  above  problems  by  examining  their 
effects,  and  by  proposing  and  demonstrating  solutions. 


Chapters  2,  3,  and  4  are  directed  at  the  problems  of  specifying 
price  adjustment,  and  specifying  serial  correlation.   In  Chapter  2,  we 
propose  using  the  switching  regression  model  with  imperfect  sample 
separation  of  Lee  and  Porter  (1984)  to  incorporate  price  adjustment  into 
disequilibrium  models.   The  model  enables  price  adjustment  to  be 
incorporated  with  less  a  priori  information  than  usual.   To  estimate  the 
model,  ML  and  LS  estimators  are  proposed. 

In  Chapter  3,  the  asymptotic  properties  of  the  ML  estimator  are 
examined  in  the  context  of  possible  serial  correlation.   This  chapter 
builds  on  previous  results  of  Hartley  and  Mallela  (1977),  and  Amemiya 
and  Sen  (1977).   By  incorporating  into  their  results  some  recent 
developments  in  modeling  serial  correlation  by  White  and  Domowitz  (1984) 
and  others,  the  analysis  permits  the  data  to  be  characterized  by  unknown 
and  general  forms  of  serial  correlation.   At  the  same  time,  the 
estimation  problem  remains  computationally  tractable. 

In  Chapter  4,  the  practical  importance  of  the  methodology  developed 
in  chapters  2  and  3  is  illustrated  with  an  empirical  example.   The 
methodology  is  applied  to  monthly  data  on  the  U.S.  commercial  loan 
market  from  1979  to  1984. 

The  final  chapter,  Chapter  5,  proposes  semiparametric  models  and 
estimators  for  markets  in  disequilibrium.   Unlike  the  previous  chapters, 
the  results  of  Chapter  5  apply  when  the  functional  forms  of  the  error 
distribution  functions  are  unknown,  and  the  observed  quantity  transacted 
is  an  unknown  function  of  desired  supply  and  demand.   Consistent 
semiparametric  estimators  are  derived  by  extending  the  method  of  maximum 
score  of  Manski  (1975,  1985)  to  a  new  class  of  applications. 


Although  the  focus  is  on  the  disequilibrium  estimation  problem, 
many  of  the  issues  addressed  are  applicable  to  other  important  problems 
as  well.   From  a  general  perspective,  the  central  issue  is  how  to  deal 
with  an  estimation  problem  characterized  by  less  information  than  what 
is  usually  assumed.   The  methodology  with  which  we  confront  the  issue 
brings  together  important  works  from  the  areas  of  limited  dependent 
variables,  nonlinear  estimation,  asymptotic  theory,  data  analysis, 
maximum  likelihood,  least  squares,  and  semiparametric  estimation. 


NOTES 

1. One notable difference among many proposed disequilibrium
specifications is the treatment of price adjustment. Different price
adjustment models often produce different coefficient estimates and
inferences for given supply and demand equations. Most studies assume
normally distributed error terms, and that the quantity transacted is
the minimum of desired supply and demand. Surveys of disequilibrium
specifications commonly used in applied work can be found in Bowden
(1978) and Maddala (1983).

2. General discussions of LS and ML estimators for disequilibrium
models can be found in Bowden (1978) and Maddala (1983).

3. The random variable $\varepsilon_{1t}$ conditional on $D_t < S_t$ has a truncated
normal distribution. Formulae for means and variances of truncated
random variables can be found in Maddala (1983, pp. 365-370).


CHAPTER  2 

A  GENERAL  DISEQUILIBRIUM  MODEL  AND  ESTIMATORS  FOR  LIMITED 

A  PRIORI  PRICE  ADJUSTMENT  INFORMATION 


2.1  Introduction 

Price  adjustment  has  a  well  defined  role  in  the  equilibrium  model: 
prices  adjust  to  clear  the  market;  data  are  observed  only  after 
adjustments  terminate,  and  therefore  are  uninformative  on  the  forces 
which  led  to  equilibrium.  When  we  assume  prices  clear  the  market, 
modeling  price  adjustment  is  trivial.   In  contrast,  when  we  assume 
disequilibrium,  and  therefore  observe  adjustments  at  various  stages, 
modeling  the  process  becomes  nontrivial  and  affects  the  estimation  of 
the  supply  and  demand  model.   The  research  that  has  followed  Fair  and 
Jaffee has given this issue only limited attention. To lessen the
neglect, the present chapter examines the importance of price adjustment
to  estimation,  and  offers  a  new  approach  for  introducing  price 
adjustment  into  the  disequilibrium  model. 

The  estimation  of  a  disequilibrium  model  carries  the  reservation 
that  estimates  are  sensitive  to  the  price  adjustment  specification.  This 
sensitivity  is  evident  in  many  of  the  empirical  studies  which  followed 
Fair  and  Jaffee.   For  example,  Maddala  and  Nelson  (1974)  obtained  the 
maximum  likelihood  (ML)  estimates  of  a  housing  market  in  disequilibrium 
under  two  different  price  adjustment  specifications, 

PA1. the sign of excess demand is given by the direction of the price
change, or equivalently

$\Pr(\Delta p_{t+1} > 0 \mid D_t > S_t) = 1$ and $\Pr(\Delta p_{t+1} > 0 \mid D_t < S_t) = 0$;

PA2. ignore whatever information the direction of the price change
contains on excess demand. (This is a limited-information approach
as no attempt is made to model price adjustment.) In the next
section, we shall see that this specification can be usefully
viewed as imposing the following constraint:

$\Pr(\Delta p_{t+1} > 0 \mid D_t > S_t) = \Pr(\Delta p_{t+1} > 0 \mid D_t < S_t)$.

For  the  two  sets  of  estimates  the  following  conflicts  are  apparent:  one 
estimated  coefficient  is  negative  under  PA1  and  positive  under  PA2; 
another is statistically significant under PA2 but not under PA1; the
estimated variance of the supply error term is twenty-five times larger
under PA2, and the same parameter for the demand equation is ten times
larger.

Economic  theory  imposes  few  restrictions  on  the  dynamics  of  price 
adjustment,  and  consequently  provides  little  basis  for  choosing  between 
specifications  such  as  PA1  and  PA2.   Perhaps  this  explains  why  in  many 
studies  the  Fair-Jaffee  models  are  applied  rather  mechanically  with  no 
discussion  of  why  a  particular  choice  is  appropriate  for  a  given  market. 
The tendency has been either to specify a convenient but restrictive
price adjustment mechanism such as PA1, or to ignore potential relations
between price and excess demand as in PA2. Apart from the potential for
conflicting  results,  each  approach  has  serious  drawbacks.   The 
restrictive  approach  may  misspecify  the  model,  and  therefore  lead  to 
inconsistent  estimates  of  the  supply  and  demand  parameters.   On  the 
other  hand,  if  there  is  some  interaction  between  price  and  excess 
demand,  then  efficiency  will  be  lost  if  price  adjustment  is  completely 
ignored.   In  short,  even  if  the  estimates  under  PA1  are  close  to  those 
under PA2, problems remain.

The failure of many empirical studies to adequately represent price
adjustment  stems  from  a  failure  to  carefully  assess  what  is  known  a 
priori.   For  most  applications  PA1  imposes  too  much  structure,  and  PA2 
imposes  too  little.   What  is  needed  is  an  approach  which  allows  price 
and  excess  demand  to  interact,  but  at  the  same  time  is  unrestrictive. 

I  propose  nesting  PA1  and  PA2  in  a  more  general  model  using  a 
method  suggested  by  Lee  and  Porter  (1984).   In  many  respects  the 
approach  is  less  restrictive  than  usual.   Price  adjustments  are  assumed 
to  be  governed  by  the  following  condition: 

PA3. The direction of the price change is most likely, but not certain,
to follow the direction of excess demand, or equivalently

$\Pr(\Delta p_{t+1} > 0 \mid D_t > S_t) > \Pr(\Delta p_{t+1} > 0 \mid D_t < S_t)$.

Although  PA3  allows  for  the  possibility  that  excess  demand  influences 
price  changes,  it  does  not  restrict  the  direction  of  the  price  change  to 
correspond  to  the  sign  of  excess  demand,  impose  a  specific  price 
adjustment  equation,  or  restrict  price  changes  to  obey  a  known 
probability  distribution.   The  approach  entails  estimating  the 
conditional  probabilities  in  PA3,  and  hence  the  data  rather  than  a 
priori  constraints  such  as  PA1  or  PA2  determine  to  what  extent  prices 
are  related  to  excess  demand.  Moreover,  the  problem  of  modeling  price 
adjustment  is  placed  in  a  unified  framework  which  permits  a  useful 
discussion  of  the  relationship  between  the  price  adjustment 
specification,  and  the  statistical  properties  of  estimation.   PA1  and 
PA2  are  special  cases  of  PA3,  and  it  is  argued  that  imposing  PA1  can 
lead  to  inconsistent  estimates,  while  imposing  PA2  can  suppress 
exploitable information on the supply and demand parameters.

The model and its maximum likelihood estimator are discussed in
section  2.2.   In  section  2.3  a  convenient  least  squares  approach  is 
proposed  which  has  not  been  previously  available  for  disequilibrium 
models.   The  LS  estimator  resembles  that  suggested  by  Heckman  (1976)  for 
the  Tobit  model.  Although  the  ML  estimator  presented  in  section  2.2  is 
more  efficient,  the  LS  estimator  is  easier  to  compute,  and  provides 
consistent  starting  values  if  the  ML  estimates  are  desired.  An  initial 
consistent  estimator  is  especially  important  when  PA1  is  relaxed  since 
the  resulting  likelihood  generally  has  multiple  solutions. 

2.2  The  Model  and  Maximum  Likelihood  Estimation 
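I propose the following model:

$D_t = \beta_1^{0\prime} x_t + \varepsilon_{1t}$  (demand)

$S_t = \beta_2^{0\prime} x_t + \varepsilon_{2t}$  (supply)

$(\varepsilon_{1t}, \varepsilon_{2t})' \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{\varepsilon 1}^2 & 0 \\ 0 & \sigma_{\varepsilon 2}^2 \end{pmatrix} \right)$  (normality)

$Q_t = \min(D_t, S_t)$  (quantity transacted)

$\Pr(\Delta p_{t+1} > 0 \mid D_t > S_t) > \Pr(\Delta p_{t+1} > 0 \mid D_t < S_t)$,  (PA3)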

where  the  variables  are  as  they  were  defined  in  Chapter  one.   The 
specification  of  the  demand  and  supply  equations,  normality  for  the 
error  terms,  and  the  quantity  transacted  as  the  minimum  of  supply  and 
demand  has  become  standard  practice  for  empirical  studies  of  markets  in 
disequilibrium.   The  model  differs  from  previous  disequilibrium 
specifications with the introduction of PA3: shortages (D>S) and
surpluses (D<S) are assumed to bias price changes upward and downward
respectively,  but  in  an  unpredictable  manner.   The  conditional 
probabilities  are  unknowns  which  can  assume  any  values  in  the  unit 
interval  that  satisfy  the  inequality. 

The data consist of $n$ observations on $(Q_t, x_t, 1(\Delta p_{t+1} > 0))$, where
$1(\cdot)$ is the indicator function, and the problem is to estimate the
unknown supply and demand parameters along with the conditional
distribution of $1(\Delta p_{t+1} > 0)$ subject to PA3.
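As an illustration of the data structure just described, the sketch below simulates observations from the model under assumed parameter values. Everything specific here is hypothetical: the function name `simulate_market`, the two-element regressor (a constant and a uniform draw), and the parameter values are not from the text, and the `shortage` field is latent and stored only so the simulation can be checked.

```python
import random


def simulate_market(n, beta1, beta2, sd1, sd2, p11, p10, seed=0):
    """Draw n observations (Q_t, x_t, 1(dp_{t+1} > 0)) from the PA3 model.

    p11 = Pr(price rises | D > S), p10 = Pr(price rises | D < S).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (1.0, rng.uniform(0.0, 2.0))          # constant plus one regressor
        d = sum(b * v for b, v in zip(beta1, x)) + rng.gauss(0.0, sd1)
        s = sum(b * v for b, v in zip(beta2, x)) + rng.gauss(0.0, sd2)
        q = min(d, s)                              # short side of the market
        prob_up = p11 if d > s else p10            # price rise depends on regime only
        up = 1 if rng.random() < prob_up else 0
        rows.append({"q": q, "x": x, "up": up, "shortage": d > s})
    return rows
```

With `p11 = 1` and `p10 = 0` the price indicator reproduces the latent regime exactly (the PA1 case); with `p11 = p10` it carries no information about the regime (the PA2 case).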

To  make  the  problem  operational  we  will  adopt  the  methodology  of 
Lee  and  Porter  (1984)  which  entails  the  following  assumptions: 

Assumption 2.1. Given $D_t > S_t$ (or $D_t < S_t$), $Q_t$ and $1(\Delta p_{t+1} > 0)$ are mutually
independent for all $t$;

Assumption 2.2. the conditional probabilities of PA3 do not vary with
$t$; i.e., $p_{11} \equiv \Pr(\Delta p_{t+1} > 0 \mid D_t > S_t)$, $p_{10} \equiv \Pr(\Delta p_{t+1} > 0 \mid D_t < S_t)$.

Assumption  2.2  is  the  simplest  assumption  that  allows  the  price 
adjustment  probabilities  to  be  treated  as  estimable  parameters,  but  is 
not  the  only  possible  way  of  doing  so.   For  example,  if  it  is  suspected 
that  price  setting  behavior  differs  between  certain  subsamples,  then  a 
different  pair  of  parameters  could  be  defined  for  each.   One  possible 
application  might  be  a  market  where  prices  are  regulated  in  some 
periods, but not in others. Alternatively, a completely varying-parameter
approach is developed in Chapter 5. Although assumptions 2.1 and
2.2  are  still  somewhat  restrictive,  arguably  the  benefits  obtained  from 
imposing  them  outweigh  the  costs.   Price  adjustment  is  incorporated 
without  an  explicit  adjustment  equation,  a  specific  distribution  for 
price changes, or the restriction that price changes reveal the sign of
excess demand. Furthermore, as we shall see next, estimation is
relatively  straightforward  under  assumptions  2.1  and  2.2. 

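The log likelihood function of $n$ independent observations on
$(Q_t, 1(\Delta p_{t+1} > 0) \mid x_t, \theta, p_{11}, p_{10})$, where $\theta = (\beta_1, \sigma_{\varepsilon 1}^2, \beta_2, \sigma_{\varepsilon 2}^2)$, is

$L_n(\theta, p_{11}, p_{10}) = \sum_{t=1}^{n} \log f_t(Q_t, 1(\Delta p_{t+1} > 0))$,  (2.3)

where

$f_t(Q_t, 1(\Delta p_{t+1} > 0)) = (p_{11} g_{st} + p_{10} g_{dt})^{1(\Delta p_{t+1} > 0)} \, ((1 - p_{11}) g_{st} + (1 - p_{10}) g_{dt})^{1(\Delta p_{t+1} \leq 0)}$,

$g_{st} = \int_{Q_t}^{\infty} g_t(D_t, Q_t) \, dD_t$,  $g_{dt} = \int_{Q_t}^{\infty} g_t(Q_t, S_t) \, dS_t$,

and $g_t(D_t, S_t)$ is the joint density of $D_t$ and $S_t$ given $x_t$ and $\theta$. Under
fairly general conditions, a consistent and asymptotically normal
estimate of $(\theta^0, p_{11}^0, p_{10}^0)$ can be obtained by maximizing $L_n$ over an
appropriate parameter space. (The asymptotic properties of a maximizer
of $L_n$ are developed in the next chapter.)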
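The log likelihood in (2.3) can be sketched in a few lines once the normality assumption is used to evaluate the two integrals. This is a minimal illustration rather than the estimation code behind the dissertation: it assumes independent normal errors, so that g_st is the supply density at Q_t times Pr(D_t > Q_t) and g_dt is the demand density at Q_t times Pr(S_t > Q_t). The function names and the `(q, x, up)` data layout are hypothetical.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _sf(z):
    """Standard normal survival function, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def regime_densities(q, x, beta1, sd1, beta2, sd2):
    """Return (g_st, g_dt) for one observation under independent normal errors."""
    z1 = (q - sum(b * v for b, v in zip(beta1, x))) / sd1   # demand residual
    z2 = (q - sum(b * v for b, v in zip(beta2, x))) / sd2   # supply residual
    g_s = _pdf(z2) / sd2 * _sf(z1)   # Q = S and D > Q (shortage)
    g_d = _pdf(z1) / sd1 * _sf(z2)   # Q = D and S > Q (surplus)
    return g_s, g_d


def log_likelihood(data, beta1, sd1, beta2, sd2, p11, p10):
    """Sum of log f_t over observations (q, x, up), up = 1 if the price rose."""
    total = 0.0
    for q, x, up in data:
        g_s, g_d = regime_densities(q, x, beta1, sd1, beta2, sd2)
        if up:
            f = p11 * g_s + p10 * g_d
        else:
            f = (1.0 - p11) * g_s + (1.0 - p10) * g_d
        total += math.log(f)
    return total
```

Setting `p11 = p10` makes the price indicator factor out of each term, so the function reduces to the likelihood of Q_t alone plus a term that does not involve the supply and demand parameters.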


The Maddala-Nelson estimators discussed in section 2.1 are obtained
by maximizing $L_n$ subject to

(PA1): $(p_{11}, p_{10}) = (1, 0)$;
(PA2): $p_{11} = p_{10}$.

As was noted, however, applying these two estimators to a given data set
can produce conflicting results. One advantage of specifying PA3 is
that the parameter space includes the entire region $\{(p_{11}, p_{10}):
1 \geq p_{11} > p_{10} \geq 0\}$, and consequently it is not necessary to choose between
PA1 and PA2.

By viewing the Maddala-Nelson estimators as constrained maximizers of
$L_n$, two additional limitations that are overcome by specifying PA3
can be seen. First, if the direction of the price change does not
always follow the sign of excess demand so that $p_{11} < 1$ or $p_{10} > 0$, then the
estimator obtained by maximizing $L_n$ subject to $(p_{11}, p_{10}) = (1, 0)$ is
inconsistent. In other words, if it is incorrectly assumed that
$1(\Delta p_{t+1} > 0)$ separates the sample into the underlying demand ($Q_t = D_t$) and
supply ($Q_t = S_t$) regimes, then the resulting estimates will be generally
inconsistent. To see this denote the constrained estimator by $\hat{\theta}_n(1,0)$,
and suppose that $\varepsilon_{1t}$, $\varepsilon_{2t}$ are normally distributed independent random
variables, and $p_{11}^0 < 1$, $p_{10}^0 = 0$. Then it is shown in Appendix A.1 that

$E(\partial L_n(\theta^0; 1, 0)/\partial \beta_1) = (1 - p_{11}^0) \sum_{t=1}^{n} x_t E(Q_t - D_t)/\sigma_{\varepsilon 1}^2$.  (2.4)

Since $\Pr(D_t > S_t) > 0$, and $\Pr(D_t \geq Q_t = \min(D_t, S_t)) = 1$, we have $E(Q_t - D_t) < 0$,
and it follows from (2.4) that in general plim $\hat{\theta}_n(1,0) \neq \theta^0$. (For further
details see Appendix A.1.)
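The sign of E(Q_t - D_t) can be evaluated in closed form under the same assumptions. Since Q_t - D_t = -max(D_t - S_t, 0) and D_t - S_t is normal with mean m and variance var1 + var2, a standard result gives E(Q_t - D_t) = -(s*phi(m/s) + m*Phi(m/s)) with s = sqrt(var1 + var2). The sketch below uses that result to evaluate one summand of (2.4); the function names and the argument `mean_gap` (the mean of D_t - S_t) are hypothetical labels, and independent normal errors are assumed.

```python
import math


def expected_q_minus_d(mean_gap, var1, var2):
    """E(min(D, S) - D) when D - S ~ N(mean_gap, var1 + var2)."""
    s = math.sqrt(var1 + var2)
    z = mean_gap / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return -(s * pdf + mean_gap * cdf)


def score_term(p11, x, mean_gap, var1, var2):
    """One summand of (2.4): (1 - p11) * x_t * E(Q_t - D_t) / var1."""
    e = expected_q_minus_d(mean_gap, var1, var2)
    return [(1.0 - p11) * v * e / var1 for v in x]
```

The summand vanishes only when `p11 = 1`, that is, when PA1 actually holds; for any `p11 < 1` its constant-term component is strictly negative, which is the source of the inconsistency described above.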

The second limitation overcome by maximizing $L_n$ over the
unconstrained space demonstrates the importance of incorporating price
adjustment into the model. If price changes are related to excess
demand so that $p_{11}^0 \neq p_{10}^0$, then the observations on $1(\Delta p_{t+1} > 0)$ contain
information on $\theta^0$ that is exploited by the maximizer of $L_n$ only if the
restriction $p_{11} = p_{10}$ is not imposed. Since imposing $p_{11} = p_{10}$ is equivalent
to estimating the model without a price adjustment specification,
this implies that one is better off using even limited amounts of price
adjustment information rather than neglecting it altogether. This can
be seen by examining the difference between the corresponding
information matrices of the constrained and unconstrained estimators of
$\theta^0$. For this purpose note that $p_{11} = p_{10}$ implies that $Q_t$ and $1(\Delta p_{t+1} > 0)$
are independent, the marginal distribution of $1(\Delta p_{t+1} > 0)$ does not depend
on $\theta^0$, and therefore the $\theta^0$-estimator obtained by maximizing $L_n$ subject
to $p_{11} = p_{10}$ can be written as

$\hat{\theta}_n(p_{11} = p_{10}) = \arg\max_{\theta} \sum_{t=1}^{n} \log g_t(Q_t)$,

where $g_t(Q_t)$ is the density of $Q_t$ given $x_t$. Since $\hat{\theta}_n(p_{11} = p_{10})$ does not
require the joint observation $(Q_t, 1(\Delta p_{t+1} > 0))$, it uses one more
observation on $Q_t$ than the $\theta^0$-estimator obtained by maximizing $L_n$ over
the unconstrained space, and therefore we write the latter estimator as

$\hat{\theta}_{n-1}(p_{11} \neq p_{10}) = \arg\max_{(\theta, p_{11}, p_{10})} \sum_{t=1}^{n-1} \log f_t(Q_t, 1(\Delta p_{t+1} > 0))$.

Unlike $\hat{\theta}_n(p_{11} = p_{10})$, the estimator $\hat{\theta}_{n-1}(p_{11} \neq p_{10})$ uses the price
adjustment information implied by $p_{11} \neq p_{10}$, namely the dependence of
$1(\Delta p_{t+1} > 0)$ and $Q_t$. For simplicity suppose that the observations are
identically distributed. The trade-off between the extra observation on
$Q_t$ used in $\hat{\theta}_n(p_{11} = p_{10})$, and the price adjustment information exploited by
$\hat{\theta}_{n-1}(p_{11} \neq p_{10})$ is apparent in the difference between their corresponding
information matrices:

$(n-1) E(-\partial^2 \log f / \partial\theta \partial\theta') - n E(-\partial^2 \log g / \partial\theta \partial\theta') =
(n-1) E(-\partial^2 \log h / \partial\theta \partial\theta') - E(-\partial^2 \log g / \partial\theta \partial\theta')$

where $h$ is the density of $1(\Delta p_{t+1} > 0)$ given $(Q_t, x_t)$. In large samples
the information provided by the extra observation on $Q_t$ in $\hat{\theta}_n(p_{11} = p_{10})$ is
insignificant, and clearly $\hat{\theta}_{n-1}(p_{11} \neq p_{10})$ is the more efficient
estimator.
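The information-matrix identity can be checked in one step, assuming (as the definitions of f, g, and h suggest) that the joint density factors as f = h g. The shorthand I_h and I_g below is introduced only for this sketch and does not appear in the text.

```latex
\begin{align*}
\log f &= \log h + \log g, \\
E\!\left(-\frac{\partial^2 \log f}{\partial\theta\,\partial\theta'}\right)
  &= \underbrace{E\!\left(-\frac{\partial^2 \log h}{\partial\theta\,\partial\theta'}\right)}_{I_h}
   + \underbrace{E\!\left(-\frac{\partial^2 \log g}{\partial\theta\,\partial\theta'}\right)}_{I_g}, \\
(n-1)\,(I_h + I_g) - n\,I_g &= (n-1)\,I_h - I_g .
\end{align*}
```

The first term on the right grows with n while the second does not, which is why the extra observation on Q_t becomes negligible in large samples.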
Having developed a fairly unrestrictive approach for introducing
price adjustment into the disequilibrium model, an important question
remains: is maximization of $L_n$ over $(\theta, p_{11}, p_{10})$ computationally
tractable? This question is important given a common structure shared
by both $L_n$ with $(p_{11}, p_{10})$ unrestricted, and $L_n$ restricted by $p_{11} = p_{10}$:
neither specification permits the observations to be separated into the
underlying supply and demand regimes, and hence both are switching
regression models with unknown sample separation. The question of
tractability arises because likelihood functions of unknown sample
separation models generally have an unknown number of local maxima, and
finding the consistent and asymptotically normal estimate (the global
maximum) usually requires an exhaustive set of local candidates. For
example, Maddala and Nelson found that three different starting values
produced three different sets of estimates, and were not able to rule
out the possibility of other solutions. Unfortunately, the extra
information provided by the joint observation $(Q_t, 1(\Delta p_{t+1} > 0))$ does not
automatically eliminate the problem; in general, $L_n$ is likely to have
multiple solutions. Fortunately, the problem can be circumvented. If
one is willing to assume that $p_{11} \neq p_{10}$, then it is possible to construct
a computationally simple and consistent estimator of $(\theta, p_{11}, p_{10})$, and
therefore obtain consistent starting values to iterate to a local maximum
of $L_n$. The consistency of the initial estimates generally guarantees the
consistency and asymptotic normality of the resulting solution (see note 2). The
next task is to describe the initial estimator.

2.3.  A  Consistent  Initial  Least  Squares  Estimator 

While computationally simple and consistent estimators have been proposed for other limited dependent variable models such as the Tobit, similar results have not been previously available for the disequilibrium model with unknown sample separation. Ironically, the models for which such estimators have been available generally possess tractable likelihood functions, and therefore finding consistent initial estimates is of limited value. A prime example is the Tobit model, for which consistent initial estimators were proposed by Amemiya (1973) and Heckman (1976); their estimators are not particularly useful for the Tobit, as this model has a globally concave likelihood function (when suitably parameterized), which ensures convergence to the consistent and asymptotically normal maximizer from any starting values.³ In contrast, the likelihood functions of models with unknown sample separation are likely to have multiple maxima, and therefore finding initial consistent estimates for these models is crucial.

The estimator described below extends the approaches suggested by Amemiya and Heckman to disequilibrium models with unknown sample separation. The method requires the first moment of $1(\Delta p_{t+1}>0)$ $(t=1,\ldots,n)$, and the first and second moments of $Q_t$ $(t=1,\ldots,n)$. Least squares is then applied successively to three estimation equations. Assuming that $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are independent normally distributed random variables, the relevant equations are

$$1(\Delta p_{t+1}>0) = E\,1(\Delta p_{t+1}>0) + u_{1t} \tag{2.5}$$

$$Q_t = E(Q_t) + u_{2t} \tag{2.6}$$

$$Q_t^2 = E(Q_t^2) + u_{3t} \tag{2.7}$$


where 


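$$E(1(\Delta p_{t+1}>0)) = p_{11}^0 - (p_{11}^0 - p_{10}^0)\,\Phi(x_t\gamma^0), \tag{2.8}$$

$$E(Q_t) = (1-\Phi(x_t\gamma^0))\,x_t\beta_2^0 + \Phi(x_t\gamma^0)\,x_t\beta_1^0 - (\sigma_{\varepsilon_1}^{0\,2} + \sigma_{\varepsilon_2}^{0\,2})^{1/2}\,\phi(x_t\gamma^0), \tag{2.9}$$

$$\begin{aligned}
E(Q_t^2) ={}& (1-\Phi(x_t\gamma^0))(x_t\beta_2^0)^2 + \Phi(x_t\gamma^0)(x_t\beta_1^0)^2 \\
&+ \sigma_{\varepsilon_2}^{0\,2}\bigl(1-\Phi(x_t\gamma^0) - 2x_t\beta_2^0\,\phi(x_t\gamma^0)/(\sigma_{\varepsilon_1}^{0\,2} + \sigma_{\varepsilon_2}^{0\,2})^{1/2}\bigr) \\
&+ \sigma_{\varepsilon_1}^{0\,2}\bigl(\Phi(x_t\gamma^0) - 2x_t\beta_1^0\,\phi(x_t\gamma^0)/(\sigma_{\varepsilon_1}^{0\,2} + \sigma_{\varepsilon_2}^{0\,2})^{1/2}\bigr) \\
&+ (\sigma_{\varepsilon_2}^{0\,2} - \sigma_{\varepsilon_1}^{0\,2})\,\phi(x_t\gamma^0)\,x_t\gamma^0,
\end{aligned} \tag{2.10}$$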

$\gamma^0 = (\beta_2^0-\beta_1^0)/(\sigma_{\varepsilon_1}^{0\,2} + \sigma_{\varepsilon_2}^{0\,2})^{1/2}$, and $\phi(\cdot)$ and $\Phi(\cdot)$ denote the standard normal density and c.d.f., respectively. Given appropriate regularity conditions, nonlinear LS applied to equation (2.5) yields consistent estimates of $p_{11}^0$, $p_{10}^0$, and $\gamma^0$. These estimates are then used to estimate the nonlinear functions, $\phi$ and $\Phi$, in equation (2.6). Ordinary LS can then be applied to (2.6) to consistently estimate $\beta_1^0$, $\beta_2^0$ and $(\sigma_{\varepsilon_1}^{0\,2} + \sigma_{\varepsilon_2}^{0\,2})^{1/2}$. Finally, the nonlinear functions of equation (2.7) are estimated so that OLS can be applied, and consistent estimates of $\sigma_{\varepsilon_1}^{0\,2}$ and $\sigma_{\varepsilon_2}^{0\,2}$ obtained. The asymptotic properties of the LS estimator are developed in appendix A.2.
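
The moment functions that drive the first two least squares steps can be checked numerically. The sketch below is illustrative only: it assumes demand $D_t = x_t\beta_1 + \varepsilon_{1t}$, supply $S_t = x_t\beta_2 + \varepsilon_{2t}$, $Q_t=\min(D_t,S_t)$, and that $p_{11}$ and $p_{10}$ are the probabilities of a price rise under excess demand and excess supply respectively. The function names (`price_rise_prob`, `mean_quantity`, `simulate`) are hypothetical, and the arguments `xb1`, `xb2` stand for the scalar indices $x_t\beta_1$, $x_t\beta_2$.

```python
import math
import random


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal c.d.f. Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def price_rise_prob(xb1, xb2, s1, s2, p11, p10):
    """Equation (2.8): E 1(dp > 0) = p11 - (p11 - p10) * Phi(x * gamma)."""
    x_gamma = (xb2 - xb1) / math.sqrt(s1 ** 2 + s2 ** 2)
    return p11 - (p11 - p10) * std_normal_cdf(x_gamma)


def mean_quantity(xb1, xb2, s1, s2):
    """Equation (2.9): E(Q) for Q = min(D, S) with independent normal errors."""
    scale = math.sqrt(s1 ** 2 + s2 ** 2)
    x_gamma = (xb2 - xb1) / scale
    cdf = std_normal_cdf(x_gamma)
    return (1.0 - cdf) * xb2 + cdf * xb1 - scale * std_normal_pdf(x_gamma)


def simulate(xb1, xb2, s1, s2, p11, p10, n, seed=0):
    """Monte Carlo averages of Q and 1(dp > 0) under the assumed model."""
    rng = random.Random(seed)
    sum_q = 0.0
    sum_up = 0
    for _ in range(n):
        d = rng.gauss(xb1, s1)          # demand draw
        s = rng.gauss(xb2, s2)          # supply draw
        sum_q += min(d, s)              # observed quantity is the short side
        p_up = p11 if d > s else p10    # price-rise probability by regime
        sum_up += 1 if rng.random() < p_up else 0
    return sum_q / n, sum_up / n
```

Comparing `simulate(...)` with `mean_quantity(...)` and `price_rise_prob(...)` at the same arguments gives a quick sanity check on (2.8) and (2.9). Setting `p11 == p10` makes `price_rise_prob` constant in `xb1` and `xb2`, which is the situation in which the first step carries no information about $\gamma$.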

Interestingly, the above approach is possible only if price changes provide some information on whether there is excess demand or supply; i.e., $p_{11}^0\neq p_{10}^0$. This can be seen from equation (2.8), which can be interpreted as the probability that $1(\Delta p_{t+1}>0)$ is equal to one. If price changes are completely uninformative on excess demand or supply, then $p_{11}^0=p_{10}^0$, and it follows from (2.8) that the distribution of $1(\Delta p_{t+1}>0)$ is independent of $\gamma^0$. In this case the observations on $1(\Delta p_{t+1}>0)$ contain no information on the supply and demand parameters, and therefore equation (2.5) is irrelevant for the estimation of the model.

2.4  Summary and Conclusions

The  main  points  of  this  chapter  are 

(1)  Estimates  of  disequilibrium  models  are  sensitive  to  the  price 
adjustment  specification. 

(2)  Economic  theory  imposes  few  restrictions  on  price  adjustment,  and 
consequently  provides  little  basis  for  choosing  between  specifications. 

(3)  Assumption  PA3  serves  as  an  unrestrictive  approach  for  introducing 
price  adjustments  into  disequilibrium  models;  adjustment  enters  without 
an  explicit  adjustment  equation,  a  known  probability  distribution  for 
price  changes,  or  the  restriction  that  price  changes  reveal  the  sign  of 
excess  demand. 

(4)  Assumption PA3 together with assumptions 2.1 and 2.2 permits a
straightforward derivation of the likelihood function. The parameter
space includes but is not limited to the important special cases PA1 and
PA2. Constraining the parameter space to PA1, as is often done in
practice, can lead to inconsistent estimation; constraining the space to
PA2 produces inefficient estimates.

(5)  Under  assumption  PA3  the  disequilibrium  model  is  one  of  unknown 
sample  separation,  and  therefore  its  likelihood  function  generally  has 
multiple  solutions.   To  resolve  the  problem  of  multiple  solutions,  the 
least  squares  method  described  in  section  2.3  provides  consistent 
initial  estimates. 

In  Chapter  4  we  apply  the  methodology  developed  in  the  present 
chapter  to  monthly  data  on  the  U.S.  commercial  loan  market.   Before 
proceeding  to  the  application,  however,  the  problem  of  serial 
correlation  must  be  addressed.   In  the  next  chapter  we  develop  some 
results  which  permit  the  data  to  be  analyzed  in  the  context  of  possible 
serial correlation.

NOTES

1. There is an important difference between the model Lee and Porter (1984) discuss and our model. The Lee-Porter model excludes an analog to $Q_t=\min(D_t,S_t)$, and consequently in their model the switching is exogenous; i.e., the switching that occurs between the underlying regimes is independent of the error terms. In contrast, the disequilibrium model is one of endogenous switching (the "switch" depends on $(\varepsilon_{1t},\varepsilon_{2t})$), and consequently many of the results, interpretations, and expressions found in the Lee-Porter paper must be modified accordingly.

2. In fact, given appropriate regularity conditions, consistent initial estimates ensure the consistency and asymptotic normality of the second-round estimates from a Newton-Raphson type algorithm. See, for example, Amemiya (1973, pp. 1014-15).

3. Olsen (1978) proved that the likelihood function for the Tobit model is globally concave when suitably parameterized, and thus has a single maximum.

CHAPTER 3
SOME ASYMPTOTIC THEORY FOR SERIALLY DEPENDENT OBSERVATIONS


3.1  Introduction 
In this chapter we examine the asymptotic properties of the estimator discussed in section 2.1,

$$\hat\theta_n^{pm} = \arg\max_{\theta_p} L_n(\theta_p),$$

where $\theta_p=(\theta,p_{11},p_{10})$, and $L_n(\theta_p)$ is defined on page 14, equation (2.3). If the observations are serially independent, then obviously $\hat\theta_n^{pm}$ is the MLE of $\theta_p^0$. However, for serially dependent observations, $\hat\theta_n^{pm}$ is not the MLE and will be referred to as the partial-MLE.

Hartley and Mallela (1977), and Amemiya and Sen (1977) derive asymptotic properties of the MLE for the special case of $p_{11}=p_{10}$. We will extend their results to the case of serially dependent observations and $p_{11}\neq p_{10}$ in sections 3.2 and 3.3. In section 3.4 we consider the problem of consistently estimating the asymptotic covariance matrix. In section 3.5 we derive a new test for serial correlation.

3.2  Consistency 

Since disequilibrium models are typically estimated with time series data, it is of interest that the property of consistency can be extended to the partial MLE. Using some results and definitions presented by White and Domowitz (1981), Levine (1983) has discussed how and why a partial MLE can be consistent. Levine points out that the consistency of an estimator $\hat\theta_n(y)$ which maximizes the product $\prod_{t=1}^{n} f_t(y_t|\theta)$ depends on each $f_t(y_t|\theta)$ satisfying certain regularity conditions. In general, whether or not $f_t(y_t|\theta)$ satisfies such conditions does not depend on the product being the joint density of $y=(y_1,\ldots,y_n)$, but rather it usually suffices that $f_t(\cdot|\theta)$ is the marginal density of $y_t$. The regularity conditions consist of identification conditions, and moment restrictions sufficient to apply an appropriate law of large numbers. We will show that the partial MLE for our model can satisfy such conditions by extending some results proven by Hartley and Mallela, and Amemiya and Sen. But first it is necessary to describe the type of dependence we have in mind.

We  will  adopt  the  nonparametric  approach  of  White  and  Domowitz 
(1984)  to  allow  for  the  possibility  of  serial  correlation.   The  approach 
of  White  and  Domowitz  is  nonparametric  in  the  sense  that  the 
observations  are  not  required  to  be  generated  by  a  known  parametric 
model  such  as  an  ARMA  (p,q)  process,  but  instead  must  obey  general 
memory  requirements.   The  memory  requirements  are  referred  to  as  mixing 
conditions,  and  a  sequence  of  random  variables  which  obey  mixing 
conditions  is  said  to  be  a  mixing  sequence.   More  precisely,  we  have  the 
following  definition. 

Definition 3.1. Let $(y_t)$ denote a sequence of random vectors defined on a probability space $(\Omega,F,P)$, and let $F_a^b$ denote the Borel $\sigma$-field of events generated by the random variables $y_a,y_{a+1},\ldots,y_b$. Define

$$\phi(m) = \sup_n \sup\{|P(A|B)-P(A)| : A\in F_{n+m}^{\infty},\ B\in F_{-\infty}^{n}\}$$

and

$$\alpha(m) = \sup_n \sup\{|P(B\cap A)-P(B)P(A)| : A\in F_{n+m}^{\infty},\ B\in F_{-\infty}^{n}\}.$$

(i) If $\phi(m)\to 0$ as $m\to\infty$, and $\phi(m)=O(m^{-k})$ for $k>r/(2r-1)$, where $r\ge 1$, then $(y_t)$ is a mixing sequence with $\phi(m)$ of size $r/(2r-1)$.

(ii) If $\alpha(m)\to 0$ as $m\to\infty$, and $\alpha(m)=O(m^{-k})$ for $k>r/(r-1)$, where $r>1$, then $(y_t)$ is a mixing sequence with $\alpha(m)$ of size $r/(r-1)$.

$\phi(m)$ and $\alpha(m)$ measure how much dependence exists between observations at least $m$ periods apart. A sequence such that $\phi(m)\to 0$ as $m\to\infty$ is called uniform mixing or $\phi$-mixing, and a sequence for which $\alpha(m)\to 0$ as $m\to\infty$ is called strong mixing or $\alpha$-mixing. Since the dependence coefficients, $\phi(m)$ and $\alpha(m)$, are required to vanish asymptotically, mixing is a form of asymptotic independence. A fairly large class of processes satisfy mixing conditions. For example, finite order Gaussian ARMA processes are strong mixing, as are stationary Markov chains under fairly general conditions. White and Domowitz (1984) show that measurable functions of mixing processes are mixing and of the same size. This is particularly convenient for nonlinear problems. Mixing processes are useful for modeling complex economic data since they are not required to be stationary. In short, mixing conditions provide a convenient way to model an economic phenomenon that is likely to be both heterogeneous and time dependent.

The following law of large numbers, due to McLeish (1975), applies for mixing sequences.²

Theorem 3.2. Let $(y_t)$ be a sequence with $\phi(m)$ of size $r/(2r-1)$ or $\alpha(m)$ of size $r/(r-1)$, $r>1$, such that $E|y_t|^{r+d}<M<\infty$ for some $d>0$, and all $t$. Then

$$(1/n)\sum_{t=1}^{n}(y_t-E(y_t)) \xrightarrow{\,p\,} 0.$$

All proofs of theorems in this Chapter are provided in Appendix A.3.

For Theorem 3.2 to be applicable to a given sequence, it is clear that there is a trade-off between the moment restriction that the sequence must satisfy, and allowable dependence. The stronger the moment restriction satisfied, the more dependence as measured by $\phi(m)$ (or $\alpha(m)$) is allowed. If the members of the sequence are independent, then we can set $r=1$, and Theorem 3.2 collapses to the Markov law of large numbers. For sequences with exponential memory decay, $r$ can be set arbitrarily close to one. In general, the longer the memory of a sequence, the larger is the size of $\phi(m)$ and $\alpha(m)$, and consequently the more stringent the moment restriction (which depends on $r$) becomes.
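
A small simulation can make the law of large numbers for dependent data concrete. The sketch below is only an illustration, not part of the formal argument: it uses a stationary Gaussian AR(1) process, which falls in the class of finite order Gaussian ARMA processes cited above as strong mixing, and for which all moments exist. The function names `ar1_path` and `sample_mean` and the parameter values are hypothetical.

```python
import random


def ar1_path(n, rho, seed=0):
    """Gaussian AR(1): y_t = rho * y_{t-1} + e_t with e_t ~ N(0, 1).

    The first value is drawn from the stationary distribution
    N(0, 1 / (1 - rho^2)), so every y_t has mean zero.
    """
    rng = random.Random(seed)
    y = rng.gauss(0.0, 1.0 / (1.0 - rho ** 2) ** 0.5)
    path = []
    for _ in range(n):
        path.append(y)
        y = rho * y + rng.gauss(0.0, 1.0)
    return path


def sample_mean(values):
    """(1/n) * sum of y_t; since E(y_t) = 0 this is the quantity in Theorem 3.2."""
    return sum(values) / len(values)
```

For example, `abs(sample_mean(ar1_path(100000, 0.5)))` should be close to zero even though adjacent observations are correlated. Raising `rho` toward one lengthens the memory of the process and slows the convergence, in line with the trade-off described above.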

By using mixing conditions to restrict the serial behavior of the sequence $(Q_t,1(\Delta p_{t+1}>0),x_t)$, it is not necessary to specify an additional parametric model such as an ARMA(p,q) process. Consequently, one possible source of model misspecification is eliminated. Mixing conditions enable us to include a larger class of models in the analysis. Of course, as Theorem 3.2 implies, the precise size of the class will depend on what moment restrictions are satisfied. We are now ready to state conditions which ensure the consistency of the partial-MLE (and the MLE) of $\theta_p^0$.

In  order  to  establish  consistency  for  the  partial-MLE  we  impose  the 
following  assumptions  on  the  disequilibrium  model  presented  in  section 
2.2: 

Assumption 3.3. (allowable serial dependence): The sequence $(Q_t,1(\Delta p_{t+1}>0),x_t)$ is a mixing sequence with $\phi(m)$ of size $r/(2r-1)$, $r\ge 1$, or $\alpha(m)$ of size $r/(r-1)$, $r>1$.

Assumption 3.4. (distributions):

(i) The random vector $(\varepsilon_{1t},\varepsilon_{2t})$ is normally distributed with mean zero and covariance matrix

$$\begin{pmatrix} \sigma_{\varepsilon_1}^{2} & 0 \\ 0 & \sigma_{\varepsilon_2}^{2} \end{pmatrix}.$$

(ii) Assumptions 2.1 and 2.2 hold. (See page 13.)


Assumption  3.5.  (the  regressors): 

(i) The vector $x_t$ consists of only exogenous variables.

(ii) Each component of $x_t$ is uniformly bounded in $t$, has a finite range for each $t$, and a support given by $S_t=S$ for all $t$.

(iii) Any linear combination of the components of $x_t$ where the coefficients are not all zero is not zero with probability one.

Assumption  3.6.  (the  parameter  space): 


(i) The parameter space $\Xi$ includes the true parameter vector $\theta_p^0=(\beta_1^0,\sigma_{\varepsilon_1}^{0\,2},\beta_2^0,\sigma_{\varepsilon_2}^{0\,2},p_{11}^0,p_{10}^0)$, excludes the region $\sigma_{\varepsilon_i}^2\le 0$ $(i=1,2)$ and $p_{10}>p_{11}$, and is a compact subset of a Euclidean space.

(ii) If the set $\Xi$ includes points such that $p_{11}=p_{10}$, then it excludes the point $\theta_p^*=(\beta_2^0,\sigma_{\varepsilon_2}^{0\,2},\beta_1^0,\sigma_{\varepsilon_1}^{0\,2},p_{11}^0,p_{10}^0)$. Otherwise $\Xi$ may include $\theta_p^*$.

With  a  few  exceptions,  the  conditions  on  the  regressors  and  the 
parameter  space  are  identical  to  those  given  by  Hartley  and  Mallela 
(1977),  and  Amemiya  and  Sen  (1977).   One  exception  is  that  we  place  no 
restrictions  on  the  limiting  behavior  of  the  empirical  distribution  of 
the  regressors,  whereas  Hartley  and  Mallela  require  it  to  converge 
completely to a nondegenerate distribution. As pointed out by White
(1980), in sampling situations where the researcher has little control
over  the  data,  it  is  important  to  allow  for  the  possibility  that  the 
empirical  distribution  does  not  converge.   In  contrast  to  Amemiya  and 
Sen,  we  do  not  require  the  regressors  to  be  i.i.d.,  but  for  convenience 
retain  their  assumption  that  the  regressors  are  discrete  random 
variables. 

Assumption 3.6(ii) is necessary to identify the true parameter vector $\theta_p^0$. Without appropriate prior information on $\beta_1$ and $\beta_2$, the point $\theta_p^*$ is indistinguishable from $\theta_p^0$ and the model cannot be estimated. This is the problem of interchanging regimes, which is discussed by Hartley and Mallela, and Amemiya and Sen. Both studies point out that $\theta_p^*$ is eliminated from the parameter space if the usual "order condition" holds. We will extend this result below by showing that for $\theta_p^0$ to be distinguishable from $\theta_p^*$, it suffices to know a priori that $p_{11}^0>p_{10}^0$. In this sense prior sample separation information represents prior information on the supply and demand parameters.

Hoadley (1971) has generalized the Wald argument to the case of independent not identically distributed observations. Theorem 3.7 below is an extension of Hoadley's argument to mixing sequences, and will be used to verify that assumptions 3.3, 3.4, 3.5, and 3.6 imply consistency for the partial-MLE, $\hat\theta_n^{pm}$.

Theorem 3.7. Suppose:

(i) The sequence $(y_t)$ is a mixing sequence with $\phi(m)$ of size $r/(2r-1)$, $r\ge 1$, or $\alpha(m)$ of size $r/(r-1)$, $r>1$.

(ii) The parameter space $\Xi$ is a compact subset of a Euclidean space.

(iii) The function $f_t(y_t|\theta)$ is continuous on $\Xi$, uniformly in $t$, a.e.

(iv) The function

$$\sup\{\ln(f_t(y_t|\theta')/f_t(y_t|\theta^0)) : |\theta'-\theta|\le\rho\}$$

is a measurable function of $y_t$ for each $\theta$ belonging to $\Xi$.

(v) There exist $\rho^*(\theta)>0$ and $d>0$ such that

$$E\,\bigl|\sup\{\ln(f_t(y_t|\theta')/f_t(y_t|\theta^0)) : |\theta'-\theta|\le\rho\}\bigr|^{r+d}\le\Delta<\infty$$

for $0\le\rho\le\rho^*(\theta)$.

(vi) For $\theta\neq\theta^0$,

$$\limsup_{n}\Bigl\{n^{-1}\sum_{t=1}^{n}E\bigl(\ln(f_t(y_t|\theta)/f_t(y_t|\theta^0))\bigr)\Bigr\}<0.$$

Let $\hat\theta_n(y)$ be a function of the observations $y=(y_1,\ldots,y_n)$ which solves the problem

$$\max_{\theta}\prod_{t=1}^{n}f_t(y_t|\theta).$$

Then plim $\hat\theta_n(y)=\theta^0$.

To show that the partial-MLE $\hat\theta_n^{pm}$ is a consistent estimator of $\theta_p^0$ we verify that $\Xi$, $(Q_t,1(\Delta p_{t+1}>0),x_t)$, and $f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p)$ satisfy 3.7(i)-(vi) given assumptions 3.3 - 3.6.

The fact that the mixing and compactness requirements 3.7(i) and 3.7(ii) are satisfied follows directly from assumptions 3.3 and 3.6(i).

Lemma 3.8 establishes that $f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p)$ satisfies the continuity requirement 3.7(iii).

Lemma 3.8. Given assumptions 3.4 - 3.6, $f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p)$ is a continuous function of $\theta_p$ uniformly in $t$, a.e.

Lemma 3.9 establishes that the measurability requirement, 3.7(iv), is satisfied.

Lemma 3.9. Given assumptions 3.3 - 3.6, the function

$$\sup\{\ln(f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p')/f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p^0)) : |\theta_p'-\theta_p|\le\rho\}$$

is a measurable function of $(Q_t,1(\Delta p_{t+1}>0),x_t)$.

The moment restriction, 3.7(v), together with 3.7(i) determines the amount of dependence allowable. The following lemma extends Hartley and Mallela's Corollary 4.2, and establishes that 3.7(v) is satisfied for large $r+d$.

Lemma 3.10. Given assumptions 3.3 - 3.6, for all sufficiently small $\rho=\rho(\theta)>0$,

$$E\,\bigl|\sup\{\ln(f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p')/f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p^0)) : |\theta_p'-\theta_p|\le\rho\}\bigr|^{k}\le\Delta<\infty,$$

where $k$ is any positive integer.

Finally, Lemma 3.11 establishes that the identification condition, 3.7(vi), is satisfied. Lemma 3.11 extends Amemiya and Sen's Lemmas 2 and 3 to the case of $p_{11}\neq p_{10}$.

Lemma 3.11. Given assumptions 3.3 - 3.6, for $\theta_p\neq\theta_p^0$ there exists a negative constant $b(\theta_p)$ such that

$$E\bigl(\ln(f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p)/f_t(Q_t,1(\Delta p_{t+1}>0)|\theta_p^0))\bigr)\le b(\theta_p).$$

We have proven the following theorem.

Theorem 3.12. Given assumptions 3.3 - 3.6, plim $\hat\theta_n^{pm}=\theta_p^0$.

3.3  Asymptotic  Normality 

Under the assumption that $(Q_t,1(\Delta p_{t+1}>0),x_t)$ is a mixing sequence, we consider the limiting distribution of

$$n^{-1/2}V_n^{-1/2}(\theta_p^0)\nabla L_n(\theta_p^0),$$

where $\nabla L_n(\theta_p)$ denotes the gradient vector corresponding to $L_n(\theta_p)$, and $V_n(\theta_p^0)=\mathrm{var}(n^{-1/2}\nabla L_n(\theta_p^0))$. We will discuss conditions that imply asymptotic normality; that is,

$$n^{-1/2}V_n^{-1/2}(\theta_p^0)\nabla L_n(\theta_p^0)\overset{A}{\sim}N(0,I), \tag{3.13}$$

where $I$ denotes an identity matrix of appropriate dimensions. The results in this section together with those in the next section permit derivation of asymptotic test statistics.

As  is  well  known,  asymptotic  normality  is  proven  by  an  appropriate 
application  of  a  central  limit  theorem.   As  with  consistency,  the 
conditions  sufficient  for  asymptotic  normality  depend  on  the  degree  of 
dependence  and  heterogeneity  the  sequence  exhibits.   For  a  sequence  of 
independent  identically  distributed  random  vectors,  we  have  the 
Lindeberg-Levy  Theorem;  for  independent  not  identically  distributed  we 
have  the  Lindeberg-Feller  Theorem;  for  dependent  identically  distributed 
we  have  the  central  limit  theorem  of  Gordin  (1969);  for  dependent  not 
identically  distributed  we  have  the  central  limit  theorem  of  Serfling 
(1968). 

For the case of independent observations, Hartley and Mallela (1977) prove the asymptotic normality result (3.13) by applying a version of the Lindeberg-Feller Theorem. However, by specifying the sequence $(Q_t,1(\Delta p_{t+1}>0),x_t)$ as mixing, a more general result is possible. The following theorem is based on Theorem 2.4 of White and Domowitz (1984), which generalizes Serfling's (1968) central limit theorem.

Theorem 3.14. Suppose:

(i) Let $\nabla L_n(\theta_p^0)=\sum_{t=1}^{n}\nabla L_t(\theta_p^0)$. Then $E(\nabla L_t(\theta_p^0))=0$ for all $t$.

(ii) Let $\lambda$ be any nonzero vector, and define

$$\nabla L_{n,a}(\theta_p^0)=n^{-1/2}\sum_{t=1+a}^{n+a}\nabla L_t(\theta_p^0).$$

Then there exists a matrix $V$ such that $\det(V)>0$, and

$$\lambda E\bigl(\nabla L_{n,a}(\theta_p^0)\nabla L_{n,a}(\theta_p^0)^T\bigr)\lambda^T-\lambda V\lambda^T\to 0$$

as $n\to\infty$ uniformly in $a$.

(iii) $E|\nabla L_t(\theta_p^0)|^{2r}\le\Delta<\infty$ for some $r>1$.

If $\phi(m)$ or $\alpha(m)$ is of size $r/(r-1)$, then (3.13) holds.

Condition 3.14(i) is the familiar condition that the vector of likelihood equations, when evaluated at the true parameter vector $\theta_p^0$, has zero expectation. Sufficient conditions for 3.14(i) are (1) the model is correctly specified, and (2) the density of $(Q_t,1(\Delta p_{t+1}>0),x_t)$ is sufficiently regular to permit differentiation under the integral sign.

Condition 3.14(ii) is somewhat restrictive, but unfortunately a less restrictive replacement for it is currently not available. Condition 3.14(ii) restricts the heterogeneity of $\nabla L_t(\theta_p^0)$ by requiring it to be covariance stationary asymptotically.

Condition 3.14(iii) is a moment condition which depends on the amount of dependence the sequence $(Q_t,1(\Delta p_{t+1}>0),x_t)$ exhibits. If the sequence is serially independent, then $r$ can be set arbitrarily close to one; as the amount of dependence increases, as measured by $\phi(m)$ or $\alpha(m)$, $r$ increases accordingly.

3.4  Consistent  Covariance  Estimation 

We consider the problem of deriving consistent estimators for the asymptotic covariance matrix of the partial-MLE $\hat\theta_n^{pm}$. The expression for the asymptotic covariance matrix is

$$n\,\nabla^2\bar L_n(\theta_p^0)^{-1}\,V_n(\theta_p^0)\,\nabla^2\bar L_n(\theta_p^0)^{-1},$$

where $V_n(\theta_p^0)=\mathrm{var}(n^{-1/2}\nabla L_n(\theta_p^0))$, and $\nabla^2\bar L_n(\theta_p^0)$ is the matrix of second order partial derivatives of $\bar L_n(\theta_p)=E(L_n(\theta_p))$.

First consider the problem of consistently estimating the term $n\nabla^2\bar L_n(\theta_p^0)^{-1}$. The functional form of this term does not depend on the serial dependence (or independence) of the observations, and therefore consistent estimation of it is straightforward. The following theorem, which combines Lemma 2.6 of White (1980) with Theorem 2.3 of White and Domowitz (1984), provides conditions that imply

$$\mathrm{plim}\ n\bigl(\nabla^2L_n(\hat\theta_n^{pm})^{-1}-\nabla^2\bar L_n(\theta_p^0)^{-1}\bigr)=0.$$

Theorem 3.15. Let $q_t(y_t,\theta)$ be measurable for each $\theta$ belonging to a compact set $\Xi$, and continuous on $\Xi$ uniformly in $t$ a.e.

Suppose
(i) The sequence $(y_t)$ is mixing as stated in Definition 3.1.
(ii) For $r\ge 1$ and any $d>0$,

$$\sup_{\theta\in\Xi}E|q_t(y_t,\theta)|^{r+d}\le\Delta<\infty.$$

If plim $\hat\theta_n=\theta^0$, then

$$\mathrm{plim}\ n^{-1}\sum_{t=1}^{n}\bigl(q_t(y_t,\hat\theta_n)-q_t(y_t,\theta^0)\bigr)=0.$$


Next consider the problem of consistently estimating $V_n(\theta_p^0)$. Unlike the term $\nabla^2\bar L_n(\theta_p^0)$, the functional form of $V_n(\theta_p^0)$ depends on the nature of the serial correlation, and consequently special care must be taken. The general form for $V_n(\theta_p^0)$ is

$$V_n(\theta_p^0;c(n))=n^{-1}\sum_{t=1}^{n}E\bigl(f_t(\theta_p^0)f_t(\theta_p^0)^T\bigr)+n^{-1}\sum_{s=1}^{c(n)-1}\sum_{t=s+1}^{n}E\bigl(f_t(\theta_p^0)f_{t-s}(\theta_p^0)^T+f_{t-s}(\theta_p^0)f_t(\theta_p^0)^T\bigr),$$

where $f_t(\theta_p)\equiv\nabla\log f_t(\cdot|\theta_p)$, $f_t(\cdot|\theta_p)$ on the right-hand side is the density of $(Q_t,1(\Delta p_{t+1}>0),x_t)$, and $c(n)$ is such that $E(f_t(\theta_p^0)f_{t-s}(\theta_p^0)^T)=0$ for $s>c(n)$. The natural choice for an estimator of $V_n(\theta_p^0;c(n))$ is the sample analogue $V_n(\hat\theta_n^{pm};c(n))$. The consistency of such an estimator, however, depends on the asymptotic behavior of $c(n)$. We will consider two special cases.


Case 1. $c(n)=c<\infty$ and $c\le n-1$.

If $c(n)$ is equal to a known finite constant $c$ which is less than or equal to the sample size minus one (if $c=n$, then the estimator $V_n(\hat\theta_n^{pm};c)=0$), then imposing the conditions of Theorem 3.15 will suffice for

$$\mathrm{plim}\ \bigl(V_n(\hat\theta_n^{pm};c)-V_n(\theta_p^0;c)\bigr)=0.$$

An example of a sampling situation where $c$ is a known finite constant would be one in which the observations are known to be generated from a moving-average process of order $c$.

If $c$ is assumed to be constant and less than or equal to $n-1$, but otherwise unknown, then the problem becomes more complicated. Let $\tilde c$ denote the specified choice for an unknown $c$. In the next section we derive an asymptotic test for the hypothesis $c=\tilde c$. The test is a possible criterion for specifying $\tilde c$. The issues involved in specifying $\tilde c$ are the following. If we specify $\tilde c<c$, then the estimator $V_n(\hat\theta_n^{pm};\tilde c)$ is inconsistent since nonzero terms in $V_n(\theta_p^0;c)$ are mistakenly constrained to be zero. On the other hand, if we specify $\tilde c>c$, then the estimator is consistent, but inefficient since restrictions of the form $E(f_i(\theta_p^0)f_j(\theta_p^0)^T)=0$ are neglected. When the purpose of estimating $V_n(\theta_p^0;c)$ is to construct asymptotic test statistics, however, the essential requirement is consistency (rather than efficiency). Therefore, when the purpose is hypothesis testing, the choice $\tilde c>c$ is preferable to $\tilde c<c$.

Case 2. $\lim c(n)=\infty$ and $\lim E(f_t(\theta_p^0)f_{t-c(n)}(\theta_p^0)^T)=0$.

In this case the sequence $(f_t(\theta_p^0))$ is only assumed to be asymptotically uncorrelated. A sufficient condition for $(f_t(\theta_p^0))$ to be asymptotically uncorrelated is that the sequence $(Q_t,1(\Delta p_{t+1}>0),x_t)$ be mixing. Theoretical results for this case have been presented by White and Domowitz (1984), White (1984), and Newey and West (1985). Their results depend on restricting the growth rate of $c(n)$. Unfortunately, their results do not give any guidance concerning the choice of $c(n)$ for finite samples. The following theorem is due to Newey and West (1985), and provides sufficient conditions for plim $(V_n(\hat\theta_n^{pm};c(n))-V_n(\theta_p^0;c(n)))=0$.

Theorem 3.16. Suppose

(i) $f_t(\theta_p)$ is measurable in $(Q_t,1(\Delta p_{t+1}>0),x_t)$ for each $\theta_p$, and continuously differentiable in $\theta_p$ in a neighborhood $N$ of $\theta_p^0$.

(ii) (a) $\sup_{\theta_p\in N}|f_t(\theta_p)|^2\le\Delta<\infty$.

(b) There are finite constants $d>0$ and $r\ge 1$ such that $E|f_t(\theta_p^0)|^{4(r+d)}\le\Delta<\infty$.

(iii) $(Q_t,1(\Delta p_{t+1}>0),x_t)$ is a mixing sequence with $\phi(m)$ of size 2 or $\alpha(m)$ of size $2(r+d)/(r+d-1)$, $r>1$.

(iv) For all $t$, $E(f_t(\theta_p^0))=0$, and $n^{1/2}(\hat\theta_n^{pm}-\theta_p^0)$ is bounded in probability.

If $\lim c(n)=\infty$ such that $c(n)=o(n^{1/4})$, then

$$\mathrm{plim}\ \bigl(V_n(\hat\theta_n^{pm};c(n))-V_n(\theta_p^0;c(n))\bigr)=0.$$

One additional problem is that for $c(n)>1$ the estimate $V_n(\hat\theta_n^{pm};c(n))$ is not necessarily positive semi-definite. This can lead to negative estimates of the variances and test statistics, which are clearly not acceptable. To ensure that $V_n(\hat\theta_n^{pm};c(n))$ is positive definite, the summands can be weighted according to a procedure described in Newey and West (1985). This modification does not affect the consistency of the estimate.
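
The effect of weighting the summands can be seen in a scalar example. The sketch below is a hedged illustration rather than the procedure as stated in the cited paper: it assumes a scalar, mean-zero score sequence and assumes Bartlett-type weights $1-s/(\text{max\_lag}+1)$, the weighting commonly associated with Newey and West. The names `autocovariance`, `long_run_variance`, and `max_lag` (the largest lag included in the sum) are hypothetical.

```python
def autocovariance(f, s):
    """Sample analogue (1/n) * sum_{t=s+1}^{n} f_t * f_{t-s} for a scalar sequence."""
    n = len(f)
    return sum(f[t] * f[t - s] for t in range(s, n)) / n


def long_run_variance(f, max_lag, weighted=True):
    """Estimate var(n^{-1/2} * sum_t f_t) from autocovariances at lags 0..max_lag.

    weighted=True applies Bartlett weights 1 - s / (max_lag + 1), an assumed
    choice of weighting; weighted=False gives the plain truncated sum.
    """
    total = autocovariance(f, 0)
    for s in range(1, max_lag + 1):
        w = 1.0 - s / (max_lag + 1.0) if weighted else 1.0
        total += 2.0 * w * autocovariance(f, s)
    return total
```

For the alternating sequence `f = [1.0, -1.0] * 5` with `max_lag=1`, the unweighted sum is $1 + 2(-0.9) = -0.8$, a negative "variance", while the weighted version gives $1 + 2(0.5)(-0.9) = 0.1$. The down-weighting of higher lags is what keeps the estimate nonnegative in this scalar case.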
3.5  An  Asymptotic  Test  for  Serial  Correlation

In this section we propose a test sensitive to serial correlation in the gradient vectors $f_t(\theta_p^0)$. The test provides a criterion for specifying the constant $c$ of the covariance estimator $V_n(\hat\theta_n^{pm};c)$.

The null hypothesis of interest is

$$H_0:\ E\bigl(f_t(\theta_p^0)_i\,f_{t-c}(\theta_p^0)_j\bigr)=0\ \text{for all } i,j,$$

where $f_t(\theta_p^0)_i$ denotes the $i$-th component of the vector $f_t(\theta_p^0)$. The basis for a test of $H_0$ comes from two observations.

(1) Under $H_0$, linear combinations of the components of the vector $f_t(\theta_p^0)$ are uncorrelated with linear combinations of the components of the vector $f_{t-c}(\theta_p^0)$.

(2) Under $H_0$, the products $f_t(\hat\theta_n^{pm})_i\,f_{t-c}(\hat\theta_n^{pm})_j$ should be close to zero for sufficiently large $n$.

Therefore, a reasonable strategy for testing $H_0$ would be to compute the sample correlation between appropriate linear combinations, and reject $H_0$ if the sample correlation is too large in some sense. To this end, for a $k$-dimensional vector $f_t(\theta_p^0)$, consider the artificial regression

$$\sum_{i=1}^{k}w_{it}\,f_t(\hat\theta_n^{pm})_i=\sum_{i=1}^{k}\alpha_i\,f_{t-c}(\hat\theta_n^{pm})_i,$$

where the $w_{it}$ are known constants such that $\sum_{i=1}^{k}w_{it}=1$, and the $\alpha_i$ are unknowns to be estimated. The test we propose entails computing the OLS estimates $\hat\alpha_i$, $i=1,\ldots,k$, and testing the hypothesis $\alpha_1=\cdots=\alpha_k=0$. More
formally,  we  have  the  following  theorem. 
Theorem 3.17. Define $a=(a_1,\dots,a_k)^T$,

$$f_{-c}(\theta_p)=\begin{pmatrix} f_1(\theta_p)_1 & \cdots & f_1(\theta_p)_k \\ \vdots & & \vdots \\ f_{n-c}(\theta_p)_1 & \cdots & f_{n-c}(\theta_p)_k \end{pmatrix}, \quad (n-c)\times k,$$

$$f(\theta_p)=\begin{pmatrix} \sum_{i=1}^{k} w_{i,c+1}\, f_{c+1}(\theta_p)_i \\ \vdots \\ \sum_{i=1}^{k} w_{i,n}\, f_n(\theta_p)_i \end{pmatrix}, \quad (n-c)\times 1.$$

In addition to $H_0$ suppose

(i) The vector-valued function $f(\theta_p)$ is continuously differentiable (component by component) on an open convex set $\Theta \subset R^k$ containing $\theta_p^0$.

(ii) There exists an open neighborhood of $\theta_p^0$, $N$, such that

$$\sup_{\theta_p\in N} |f_t(\theta_p)_i| < \Delta < \infty \quad \text{and} \quad \sup_{\theta_p\in N} |\partial f_t(\theta_p)_i/\partial\theta_p| < \Delta' < \infty.$$

(iii) $\operatorname{plim}\ n^{-1}\sum_{t=1}^{n} f_t(\theta_p^0)=0$.

(iv) Let $A_n(\theta_p)=n^{-1} f_{-c}(\theta_p)^T f_{-c}(\theta_p)$, and $\bar{A}_n(\theta_p)=E(A_n(\theta_p))$. Then there exists an open neighborhood of $\theta_p^0$, $N^0$, such that $\bar{A}_n(\theta_p)$ is positive definite on $N^0$ for all $n$ sufficiently large and

$$\operatorname{plim}\ \sup_{\theta_p\in N^0} |A_n(\theta_p)-\bar{A}_n(\theta_p)|=0.$$

(v) Let $U_n(\theta_p^0)=\operatorname{var}\big(n^{-1/2} f_{-c}(\theta_p^0)^T f(\theta_p^0)\big)$, and let $U_n(\hat{\theta}_n)$ denote the sample analogue. Then $U_n(\theta_p^0)$ is positive definite on an open neighborhood of $\theta_p^0$ for all $n$ sufficiently large and

$$\operatorname{plim}\ \big(U_n(\hat{\theta}_n)-U_n(\theta_p^0)\big)=0.$$

(vi) Under $H_0$, $U_n^{-1/2}(\theta_p^0)\, n^{-1/2} f_{-c}(\theta_p^0)^T f(\theta_p^0) \overset{A}{\sim} N(0,I)$.

Let $D_n(\theta_p)=A_n^{-1}(\theta_p)\,U_n(\theta_p)\,A_n^{-1}(\theta_p)$. Then given conditions (i)-(vi), and $H_0$,

$$n\,\hat{a}_n^T\, D_n^{-1}(\hat{\theta}_n)\,\hat{a}_n \overset{A}{\sim} \chi^2_k.$$
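As a rough illustration of how the statistic might be computed, the sketch below uses the fact that $\hat{a}_n = A_n^{-1} m_n$ with $m_n = n^{-1} f_{-c}^T f$, so that $n\,\hat{a}_n^T D_n^{-1}\hat{a}_n$ reduces to $n\, m_n^T U_n^{-1} m_n$. Several choices here are assumptions rather than the dissertation's: the sample analogue of $U_n$ is taken to be the average outer product of the summands, $n-c$ is used in place of $n$, and the weights are uniform draws rescaled so that each row sums to one. All function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                for j in range(col, k + 1):
                    m[r][j] -= factor * m[col][j]
    return [m[i][k] / m[i][i] for i in range(k)]

def serial_corr_stat(f, w, c):
    """N * m' U^{-1} m with m = N^{-1} sum_t f_{t-c} y_t and y_t = sum_i w[t][i] f[t][i]."""
    n, k = len(f), len(f[0])
    N = n - c
    m = [0.0] * k
    U = [[0.0] * k for _ in range(k)]
    for t in range(c, n):
        y = sum(w[t][i] * f[t][i] for i in range(k))
        g = [f[t - c][j] * y for j in range(k)]
        for i in range(k):
            m[i] += g[i] / N
            for j in range(k):
                U[i][j] += g[i] * g[j] / N  # assumed estimator of U_n
    z = solve(U, m)
    return N * sum(m[i] * z[i] for i in range(k))

def random_weights(n, k, rng):
    """Uniform(0,1) draws rescaled so each row sums to one (assumed normalization)."""
    rows = []
    for _ in range(n):
        u = [rng.random() for _ in range(k)]
        total = sum(u)
        rows.append([x / total for x in u])
    return rows

rng = random.Random(0)
n, k = 500, 2
w = random_weights(n, k, rng)
iid = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
ar = [iid[0][:]]
for t in range(1, n):  # strongly autocorrelated series for comparison
    ar.append([0.9 * ar[t - 1][i] + iid[t][i] for i in range(k)])
stat_iid = serial_corr_stat(iid, w, c=1)
stat_ar = serial_corr_stat(ar, w, c=1)
```

On the serially independent series the statistic should behave like a chi-squared variate with $k$ degrees of freedom, while on the autocorrelated series it should be far larger.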


3.6  Summary  and  Conclusions 

The  main  points  of  this  chapter  are  the  following: 

(1) The assumptions presented in sections 3.2 and 3.3 imply that the partial-MLE of the disequilibrium model is consistent and asymptotically normal. The assumptions allow for serial correlation of an unknown form; for example, an arbitrary ARMA process is allowable for the observations. At the same time, the estimator $\hat{\theta}_n$ is computed as though the observations were serially independent, and thus computational tractability is retained.

(2) To calculate asymptotic test statistics, a consistent estimate of the asymptotic covariance matrix is needed. Obtaining a consistent covariance estimator is complicated by the need to specify a constant $c$ such that $E(f_t(\theta_p^0)\, f_{t-s}(\theta_p^0)^T)=0$ for all $s \ge c$. In general, $c$ is unknown but consistent covariance estimation depends on specifying a $\bar{c}$ such that $\bar{c} \ge c$.

(3) The test statistic presented in Section 3.5 permits a test of $H_0: c=\bar{c}$, and thus provides a criterion for specifying $\bar{c}$.
o 
NOTES

1. Our discussion of mixing draws heavily on White and Domowitz (1984), and White (1984, pp. 43-47).

Theorem  3.2  is  a  less  general  version  of  the  law  of  large  numbers 
presented  by  McLeish  (1975,  Theorem  2.10).   The  version  we  present  is 
discussed  in  White  (1984,  Corollary  3.48),  and  imposes  a  stronger  but 
simpler  moment  restriction. 

White and Domowitz (1984) extend Hoadley's Theorem A.5, which is a
uniform  law  of  large  numbers,  to  mixing  sequences  by  applying  Theorem 
2.10  of  McLeish  (1975)  instead  of  Markov's  law  of  large  numbers.   Here 
we  merely  point  out  that  Hoadley's  Theorem  1  can  be  extended  to  mixing 
sequences  using  the  same  technique. 

In some respects the conditions of Theorem 3.7 are stronger than those stated in Hoadley's Theorem 1. For example, the requirement that $f_t(y_t \mid \theta)$ is continuous can be replaced by upper semi-continuity. The conditions that we state are sufficiently general for our purposes.


CHAPTER  4 
AN  EMPIRICAL  EXAMPLE:   THE  U.S.  COMMERCIAL  LOAN  MARKET 


4.1  Introduction 

In  this  chapter  the  disequilibrium  model  described  in  section  2.2 
(page  12)  is  fitted  to  monthly  data  on  the  U.S.  commercial  loan  market 
from  1979  to  1984.   The  problem  is  to  analyze  disequilibrium  supply  and 
demand  behavior  with  limited  a  priori  information  imposed  on  the  price 
adjustment  process.   The  model  is  estimated  and  tested  with  the 
partial-MLE  and  least  squares  method  described  in  sections  2.2  and  2.3, 
respectively.   The  possibility  of  serial  correlation  is  accounted  for 
using  methods  described  in  Chapter  three. 

Disequilibrium  models  of  commercial  loan  markets  have  been 
estimated  by  Laffont  and  Garcia  (1977),  Sealy  (1979),  and  Ito  and  Ueda 
(1981).   To  design  the  specification  of  the  supply  and  demand  equations 
these  works  were  consulted.   Our  model  and  estimation  methods,  however, 
differ  from  the  previous  studies  in  three  important  respects.   First, 
price  enters  the  model  differently.   Laffont  and  Garcia,  and  Ito  and 
Ueda  constrained  the  price  change  to  separate  the  sample,  and  Sealy 
assumed  that  price  changes  were  a  linear  function  of  normal  random 
variables.   Second,  the  starting  values  we  employ  for  maximizing  the 
likelihood  function  are  consistent  estimates,  and  therefore  ensure 
convergence  to  an  asymptotically  desirable  solution.   None  of  the  above 
studies employed methods that guarantee this. Third, we will adopt the
nonparametric  approach  developed  in  Chapter  three  to  allow  for  the 
possibility  of  serial  correlation.   Given  that  the  data  is  a  time 
series,  allowing  for  serial  correlation  is  particularly  important. 
Failure  to  do  so  can  cause  inconsistent  covariance  estimates  and 
therefore  misleading  test  statistics.   In  contrast,  most  existing 
disequilibrium  studies,  including  those  mentioned  above,  apply  methods 
to  time  series  data  that  are  only  appropriate  for  serially  independent 
observations.   The  nonparametric  approach  was  chosen  for  its  generality, 
and  computational  ease.  An  arbitrary  ARMA  process  is  allowable  for  the 
error  terms,  but  at  the  same  time  the  parameter  estimators  are  computed 
as  though  the  errors  are  serially  independent.   As  opposed  to  an 
assumption  of  serial  independence,  the  only  part  of  the  problem  that 
changes  is  the  calculation  of  the  asymptotic  covariance  estimate. 

4.2  The  Empirical  Model 

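The empirical model to be estimated and tested is specified as follows.

$$D_t = \beta_{10} + \beta_{11}(RL_t - RA_t) + \beta_{12}\,IP_{t-1} + \varepsilon_{1t},$$
$$S_t = \beta_{20} + \beta_{21}(RL_t - RT_t) + \beta_{22}\,TD_t + \varepsilon_{2t},$$
$$Q_t = \min(D_t, S_t),$$

$p_{11} > p_{10}$, where $p_{11} = \Pr(\Delta RL_{t+1}>0 \mid D_t>S_t)$, and $p_{10} = \Pr(\Delta RL_{t+1}>0 \mid D_t<S_t)$.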
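To make the structure of the model concrete, the following sketch simulates a toy version of it: demand and supply are drawn, the short side is observed, and the direction of the price change is drawn with probability $p_{11}$ in a shortage and $p_{10}$ in a surplus. Every number in it (coefficients, regressor distributions, error variances) is a hypothetical placeholder and has no connection to the loan market data.

```python
import random

def simulate(n, p11, p10, rng):
    """Draw (Q_t, price-up indicator, shortage indicator, D_t, S_t) from a toy model."""
    rows = []
    for _ in range(n):
        spread_a = rng.gauss(1.0, 1.0)   # stands in for RL_t - RA_t
        spread_t = rng.gauss(2.0, 1.0)   # stands in for RL_t - RT_t
        ip_lag = rng.gauss(10.0, 1.0)    # stands in for IP_{t-1}
        td = rng.gauss(100.0, 5.0)       # stands in for TD_t
        d = 80.0 - 15.0 * spread_a + 2.0 * ip_lag + rng.gauss(0.0, 5.0)
        s = 40.0 + 5.0 * spread_t + 0.3 * td + rng.gauss(0.0, 5.0)
        q = min(d, s)                    # short side of the market is observed
        shortage = d > s
        up = rng.random() < (p11 if shortage else p10)
        rows.append((q, up, shortage, d, s))
    return rows

rows = simulate(20000, 0.8, 0.25, random.Random(0))
short = [r for r in rows if r[2]]
surplus = [r for r in rows if not r[2]]
freq_up_short = sum(r[1] for r in short) / len(short)
freq_up_surplus = sum(r[1] for r in surplus) / len(surplus)
```

Because $p_{10}>0$ and $p_{11}<1$ in this simulation, the sign of the price change separates the regimes only imperfectly, which is the situation the model is designed to handle.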


The variables we use will differ little from those of the previous studies. The variable RL is the average prime rate charged by banks; RA is the Aaa corporate bond rate, and reflects the price of alternative financing to firms; IP is the industrial production index and measures firms' expectations about future economic activity; RT is the three month treasury bill rate, and represents an alternative rate of return for banks; TD is total bank deposits in billions of dollars, and is a scale variable. The observed quantity transacted, Q, is specified as the sum of commercial and industrial loans, and the relevant price change is $\Delta RL_{t+1} = RL_{t+1} - RL_t$. All interest rates are expressed as percentages. The sample consists of 72 observations on each variable, and can be found in various issues of the Federal Reserve Bulletin.

4.3  Hypothesis  Testing  Procedures 

Two hypotheses concerning the price adjustment process, and several hypotheses concerning serial correlation, were tested. The first price adjustment hypothesis maintains that the direction of the price change, $1(\Delta p_{t+1}>0)$, can be used to separate the sample into the underlying supply ($Q_t=S_t$) and demand ($Q_t=D_t$) regimes. The approach we have chosen to model price adjustment permits the known sample separation hypothesis to be conveniently expressed as

$$H_0:\ (p_{11}, p_{10}) = (1,0).$$

The null hypothesis was tested by computing a Lagrange multiplier (LM) test. The LM test was chosen over the Wald and likelihood ratio tests because it only requires the estimates under the computationally simpler null hypothesis.

The second price adjustment hypothesis maintains that price adjustments are symmetrical in the following sense: the chance of a price increase during a shortage is the same as that of a decrease during a surplus. This hypothesis can be expressed as

$$H_0:\ p_{11} = 1 - p_{10}.$$

To test the hypothesis of symmetrical price adjustment, a Wald test was computed. The Wald test was chosen over the LM and likelihood ratio tests because it only requires the unconstrained estimates. In this case the constrained estimates (those obtained under $H_0$) offer no computational advantage over the unconstrained estimates.
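A minimal sketch of the Wald computation for a single linear restriction $p_{11}+p_{10}-1=0$ is given below. It plugs in the unconstrained estimates and asymptotic standard errors reported in Table 1, but the covariance between the two estimates is not reported, so it is set to zero here purely as an assumption; the result therefore illustrates the formula and is not meant to reproduce Table 4. The function names are hypothetical.

```python
import math

def wald_symmetry(p11, p10, var11, var10, cov=0.0):
    """Wald statistic for H0: p11 + p10 - 1 = 0 (one restriction)."""
    r = p11 + p10 - 1.0
    var_r = var11 + var10 + 2.0 * cov  # R V R' with R = (1, 1)
    return r * r / var_r

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Point estimates and standard errors from Table 1; cov = 0 is an assumption.
w = wald_symmetry(0.8178, 0.2454, 0.0673 ** 2, 0.2752 ** 2)
p_value = chi2_sf_1df(w)
```

A small statistic with a large tail probability, as here, means the symmetry restriction is not rejected at conventional levels.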

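The LM and Wald test statistics converge to their usual chi-squared limiting distributions provided that:

(1) $n^{-1/2}\, V_n^{-1/2}(\theta_p^0)\,\nabla L_n(\theta_p^0) \overset{A}{\sim} N(0,I)$;

(2) a constant $\bar{c}$ is chosen such that $\operatorname{plim}\,(V_n(\hat{\theta}_n;\bar{c}) - V_n(\theta_p^0;\bar{c}))=0$.

If $\nabla L_n(\theta_p^0)$ is a k-dimensional vector, and both (1) and (2) hold, then we can conclude

$$n^{-1}\,\nabla L_n(\theta_p^0)^T\, V_n(\hat{\theta}_n;\bar{c})^{-1}\, \nabla L_n(\theta_p^0) \overset{A}{\sim} \chi^2_k.$$

(See, for example, White (1984, Theorem 4.30).)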

The specification of $\bar{c}$ was handled as follows. The LM statistic for the first $H_0$ and the Wald statistic for the second $H_0$ were each computed for several successive values of $\bar{c}$. The LM statistic was computed for $\bar{c}=1,\dots,12$, and in each case the null hypothesis $(p_{11},p_{10})=(1,0)$ was rejected. The Wald statistic, however, produced conflicting evidence for the hypothesis $p_{11}=1-p_{10}$: for some values of $\bar{c}$ the hypothesis was rejected, and for others it was accepted. To choose among the conflicting evidence, the test statistic for serial correlation (see Section 3.5) was computed for several values of $\bar{c}$. On this basis $\bar{c}$ was specified, and a single covariance estimate for the Wald test was chosen. The covariance estimate chosen for the Wald test was also used to compute the asymptotic standard errors of the parameter estimates.

The test statistic for serial correlation depends on the correlation between linear combinations of the components of $f_t(\theta_p)$ and linear combinations of the components of $f_{t-c}(\theta_p)$. Therefore, the conclusion of the test depends on how the linear combinations are chosen, or in other words, on the specified weights $w_{it}$ (see page 36). For example, the test might reject $H_0$ for some set of weights, and not reject $H_0$ for other sets. To help cope with this difficulty, it was decided to choose the weights randomly from a uniform distribution on the interval (0,1). If there is a finite or countable number of sets of weights such that $H_0$ is incorrectly rejected or accepted, then choosing the weights from a continuous distribution ensures that, with probability one, none of these sets of weights is chosen. The weights were generated from a uniform distribution by a SAS random number generator.

4.4  The  Results 

The model was estimated under the assumption that the error terms $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are mutually independent normal variates with constant variances, but are not necessarily serially independent. First the LS method was applied. The LS estimates are reported in the first column of Table 1, and were used as starting values to obtain the ML estimates presented in the second and third columns. A computer program was written with the SAS "Matrix Procedure" for the purpose of maximizing the likelihood functions; the program uses the quadratic hill-climbing technique as presented in Goldfeld, Quandt, and Trotter (1966). In Appendix A.4 we describe the quadratic hill-climbing technique, and show that consistent initial estimates ensure that the second-round estimates obtained from the technique have the same asymptotic distribution as the partial-MLE.

The estimates in column two of Table 1 maximize the likelihood subject to $(p_{11},p_{10})=(1,0)$, or equivalently, under the assumption that the direction of the price change separates the sample into the underlying supply and demand regimes. Unlike previous studies, a test of this hypothesis was carried out. The constrained estimates were used to construct the Lagrange multiplier (LM) statistics. The LM statistic was computed with twelve different covariance estimates ($\bar{c}=1,\dots,12$). As the figures in Table 2 indicate, the hypothesis of known sample separation is rejected. The conclusion of the LM test has two important implications for the analysis of the data and model. First, it suggests that the price change alone should not be used to determine whether the sample period was characterized by excess demand, excess supply, or both. In most disequilibrium studies this type of analysis is routinely done. Second, as was shown in Section 2.1, incorrect sample separation adversely affects the large sample properties of the estimators. In view of this problem the constrained estimates are suspect.

The next estimation was performed over the unconstrained space, and consequently $p_{11}$ and $p_{10}$ were estimated along with the other parameters. In this case all of the initial consistent estimates were employed, and therefore the estimates in column three represent the consistent and asymptotically normal solution. The ML estimates are not much different from the LS estimates. This is due to stopping iteration before complete convergence to a maximum of the likelihood function. The iterative technique performed poorly for the unconstrained likelihood in the sense that the speed of convergence was extremely slow. For this reason, the final estimates were obtained from the 100th iteration, at which the gradient is not sufficiently close to zero, and therefore are not true ML estimates. However, since the initial estimates are consistent, estimates obtained after the second iteration are asymptotically equivalent to the ML estimates, and therefore nothing is lost by stopping iteration before convergence, at least asymptotically. Further details regarding this point are provided in Appendix A.4. The particular specification chosen for the model performed well in the sense that all of the estimates are of the correct sign, and most are significant. The estimates of $p_{11}$ and $p_{10}$ are .8179 and .2455, respectively, which means there is (1) an 81.79% chance of a price increase and 18.21% of a decrease during shortages, and (2) a 75.45% chance of a decrease and 24.55% of an increase during surpluses.

To select a covariance estimator for the Wald test of $H_0: p_{11}=1-p_{10}$, the serial correlation statistic was computed for $\bar{c}=1,2,3$. (See Table 3.) The hypothesis of $c=3$ was accepted. The Wald test statistic did not reject the hypothesis $H_0: p_{11}=1-p_{10}$ (see Table 4), suggesting that price adjustments are symmetrical.

The  differences  which  arise  when  the  imperfect  sample  separation 
given  by  the  price  change  is  ignored  can  be  seen  by  comparing  columns 
two  and  three  of  Table  1.  While  both  sets  of  estimates  give  the  correct 
signs  for  the  supply  and  demand  variables,  the  unconstrained  estimates 
suggest  that  demand  and  supply  are  less  responsive  to  price  changes  than 
do  the  constrained  estimates.   The  unconstrained  estimate  of  the  price 
parameter for the supply equation is approximately 40% less than the
constrained  estimate,  and  the  price  coefficient  for  the  demand  equation 
is  approximately  14%  smaller  in  absolute  value  for  the  unconstrained 
estimate.   Given  the  rejection  of  the  known  sample  separation  model, 
however,  we  are  more  inclined  to  believe  the  unconstrained  estimates. 

The problem of determining whether the period 1979-84 was characterized by excess demand or supply was also addressed with the unconstrained estimates. This was accomplished by estimating the probability of excess demand for each $t$ conditional on the quantity transacted and the direction of the price change. The expression for this conditional probability is

$$\Pr(D_t>S_t \mid Q_t, 1(\Delta p_{t+1}>0)) = \frac{(p_{11}\, g_{dt})^{1(\Delta p_{t+1}>0)}\,\big((1-p_{11})\, g_{dt}\big)^{1(\Delta p_{t+1}\le 0)}}{f_t(Q_t, 1(\Delta p_{t+1}>0))}.$$

The results are reported in Table 5. As pointed out by Lee and Porter (1984), the classification rule: $Q_t=S_t$ if $\Pr(D_t>S_t \mid Q_t, 1(\Delta p_{t+1}>0)) > .5$ and $Q_t=D_t$ otherwise, is optimal in the sense that it minimizes the probability of misclassification. Applying this rule, we find that 54.17% of the observations are excess demand and 45.83% excess supply. In contrast, if one were to rely solely on the direction of the price change, the conclusion would be 31.9% excess demand, 43.1% excess supply, and for 25% of the observations, $\Delta p_{t+1}=0$. In Table 6, the compatibility of the direction of the price change with the optimal classification rule is further examined. Comparing the two rules, excluding the observations for which $\Delta p_{t+1}=0$, we find that 9 observations out of 54 are classified differently.
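The shares quoted above follow from simple counts. The sketch below recomputes them from the six cell counts reported in Table 6; the dictionary keys are hypothetical labels for the three rows of that table.

```python
# Rows of Table 6: (count with Pr(D>S|.) > .5, count with Pr(D>S|.) < .5)
table6 = {"up": (20, 3), "down": (6, 25), "zero": (13, 5)}

n = sum(a + b for a, b in table6.values())
excess_demand = sum(a for a, _ in table6.values())
excess_supply = sum(b for _, b in table6.values())
share_demand = 100.0 * excess_demand / n
share_supply = 100.0 * excess_supply / n

# Shares implied by the direction of the price change alone.
share_up = 100.0 * sum(table6["up"]) / n
share_down = 100.0 * sum(table6["down"]) / n
share_zero = 100.0 * sum(table6["zero"]) / n

# Disagreements between the two rules among observations with a nonzero change.
nonzero = sum(table6["up"]) + sum(table6["down"])
disagree = table6["up"][1] + table6["down"][0]
```

The off-diagonal cells (a price increase classified as excess supply, or a decrease classified as excess demand) are the observations on which the two rules disagree.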
Table 1

Estimated parameters and statistics.
(Asymptotic standard errors in parentheses)

Variables                      LS initial     MLE,                      MLE,
                               estimates      $p_{11}=1$, $p_{10}=0$    $p_{11}$, $p_{10}$ unconstrained

demand const.                    79.6508        40.5262                   79.6509   (169.61)
RL-RA                           -14.9764       -17.2779                  -14.9758   (2.918)
IP$_{-1}$                         2.2856         2.5429                    2.2938   (1.170)
$\sigma^2_{\varepsilon_1}$      367.7335      2140.5700                  367.7344   (94.36)
supply const.                   -60.6708       -74.9844                  -60.6709   (145.87)
RL-RT                             4.4981         7.3034                    4.4985   (0.3834)
TD                                0.3176         0.3266                    0.3288   (0.982)
$\sigma^2_{\varepsilon_2}$     1197.7623        77.4408                 1197.7622   (87.40)
$p_{11}$                          0.8526         1.0000                    0.8178   (.0673)
$p_{10}$                          0.2571         0.0000                    0.2454   (.2752)
log likelihood                 -355.9850      -317.3710

n=72
Table 2

Test of $H_0$: $(p_{11},p_{10}) = (1,0)$

$V_n(\hat{\theta}_n;\bar{c})$     LM Statistic     $H_0$ rejected at α% level

$\bar{c}=1$                         16.7693          0.020%
$\bar{c}=2$                         28.8588          0.001%
$\bar{c}=3$                         18.9718          0.008%
$\bar{c}=4$                         65.5703          0.001%
$\bar{c}=5$                         22.9532          0.001%
$\bar{c}=6$                         17.3450          0.017%
$\bar{c}=7$                         10.7834          0.455%
$\bar{c}=8$                         12.2467          0.219%
$\bar{c}=9$                         14.4707          0.072%
$\bar{c}=10$                         5.7286          5.702%
$\bar{c}=11$                         6.5377          3.805%
$\bar{c}=12$                         6.5118          3.854%
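The significance levels in the last column can be related to the LM statistics with a short calculation. The sketch assumes a chi-squared limiting distribution with two degrees of freedom (one for each restriction in $H_0$), for which the upper-tail probability has the closed form $\exp(-x/2)$; under that assumption the last three rows of Table 2 are reproduced closely. The function name is hypothetical.

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-squared variate with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# LM statistics from the last three rows of Table 2, keyed by the lag constant.
lm_stats = {10: 5.7286, 11: 6.5377, 12: 6.5118}
levels = {c: 100.0 * chi2_sf_2df(x) for c, x in lm_stats.items()}
```

Each entry of `levels` is the smallest significance level, in percent, at which the corresponding statistic would lead to rejection.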
Table 3

Test of $H_0$: $c=\bar{c}$

$\bar{c}$     Serial Correlation Statistic     $H_0$ rejected at α% level

1             32.7552                          0.030%
2             14.2697                          16.104%
3             10.6509                          38.540%

Table 4

Test of $H_0$: $p_{11} = 1-p_{10}$

$V_n(\hat{\theta}_n;\bar{c})$     Wald Statistic     $H_0$ rejected at α% level

$\bar{c}=3$                         0.0550411          94.34%
Table 6

Compatibility of the Direction of the Price Change with the
Optimal Classification Rule

                         $\Pr(D_t>S_t \mid Q_t, 1(\Delta p_{t+1}>0)) > .5$     $\Pr(D_t>S_t \mid Q_t, 1(\Delta p_{t+1}>0)) < .5$

$\Delta p_{t+1}>0$       20                                                     3
$\Delta p_{t+1}<0$        6                                                    25
$\Delta p_{t+1}=0$       13                                                     5

CHAPTER 5
SEMIPARAMETRIC ESTIMATION OF DISEQUILIBRIUM MODELS USING THE
METHOD OF MAXIMUM SCORE


5.1 Introduction

We  consider  an  alternative  estimation  strategy  not  previously 
analyzed  for  a  disequilibrium  model.   The  strategy  is  the  so-called 
"semiparametric"  estimation  developed  in  Manski  (1975),  Cosslett  (1983), 
Powell  (1984),  Manski  (1985),  and  some  others.   Semiparametric 
estimators  have  been  shown  to  be  consistent  under  more  general 
conditions  than  the  conventional  LS  and  ML  estimators,  and  therefore 
require  fewer  prior  restrictions.   For  a  number  of  cases  where 
consistent  LS  and  ML  estimation  require  the  functional  form  of  the  error 
distribution,  consistent  semiparametric  estimators  have  been  derived 
without  imposing  functional  form.   Powell  did  so  for  the  censored 
regression  model  using  the  method  of  least  absolute  deviations,  Cosslett 
derived  a  distribution-free  ML  estimator  for  the  binary  choice  model, 
and  Manski  derived  consistent  estimators  for  the  same  model  using  the 
method  of  maximum  score.   Semiparametric  estimation  is  most  useful  when 
parametric  assumptions  cannot  be  trusted,  but  are  needed  for  consistent 
LS  and  ML  estimation.   In  particular,  it  offers  an  improved  strategy  for 
estimating  disequilibrium  models. 

We  derive  consistent  semiparametric  estimators  for  disequilibrium 
models  using  the  method  of  maximum  score  of  Manski  (1975,  1985). 
Consistent  score  estimators  are  derived  for  the  following  situations: 
the  functional  forms  of  the  error  distributions  are  unknown,  the 
quantity transacted is an unknown function of supply and demand, and the price change is an unknown function of excess demand. The presentation comprises three models and their score estimators. The models we consider are all of the following form:

M. (model): Given the supply and demand equations $S_t = \beta_2^0 x_t + \varepsilon_{2t}$ and $D_t = \beta_1^0 x_t + \varepsilon_{1t}$, the iid sequence of random vectors $(Q_t, p_t, x_t)_{t=1}^{n}$, the event $S_{pq}$ involving either $p_t$ or $Q_t$, and the event $S_x$ involving $x_t$,

$$\Pr(S_{pq} \mid S_x;\ \beta_1^0,\beta_2^0) > \Pr(S_{pq}^c \mid S_x;\ \beta_1^0,\beta_2^0),$$

where $S^c$ denotes the complement of the event $S$. General, intuitive considerations motivate the specification of $(S_{pq}, S_x)$ for each model. For example, the intuition that an expected shortage (excess demand) is a better predictor of a positive price change than an expected surplus motivates the model in Section 5.2. Given the model, consistent estimation depends on general continuity and identification assumptions which do not require prior knowledge of the functional forms of the underlying distribution functions or explicit equations for quantity or price.

The model in Section 5.2 concerns events involving the price change, $\Delta p_{t+1} = p_{t+1}-p_t$, and expected excess demand, $\beta^0 x_t = \beta_1^0 x_t - \beta_2^0 x_t$, or more specifically, the binary variables $1(\Delta p_{t+1}>0)$ and $1(\beta^0 x_t>0)$, where $1(\cdot)$ denotes the indicator function. The model maintains that given $1(\beta^0 x_t>0)$, the best forecast of $1(\Delta p_{t+1}>0)$ corresponds to $1(\Delta p_{t+1}>0) = 1(\beta^0 x_t>0)$. A score estimator of $\beta^0$ is defined and assumptions for consistency given. The model resembles the binary response model studied by Manski (1975, 1985), and shares an identification problem: $\beta^0$ is only identified up to an unknown multiplicative scalar.

The model in Section 5.3 is a more restrictive version of that in Section 5.2, but retains a considerable amount of generality. The model is designed to exploit the fully observable $\Delta p_{t+1}$ (versus $1(\Delta p_{t+1}>0)$) to identify $\beta^0$. A consistent score estimator is presented, and we show that $\beta^0$ is identified without a loss of scale. The model represents a completely new application for maximum score estimation as it differs significantly from the model studied by Manski.

The estimators presented in Sections 5.2 and 5.3 do not depend on the quantity transacted, $Q_t$, and therefore impose no restrictions on it. By neglecting the observations on $Q_t$, however, the generality involves a loss of information. In Section 5.4 we specify a model for $Q_t$, and define a corresponding score estimator. The specification, however, is insufficient to identify $\beta^0$ (even up to a multiplicative scalar) without severely restricting the distribution of $x_t$. To eliminate the identification problem the models of the previous sections are added to the specification, and the estimator is redefined. The resulting estimator uses the entire sample $(Q_t, \Delta p_{t+1}, x_t)_{t=1}^{n}$, and is shown to be consistent under general conditions.

5.2  A  Directional  Model  and  Consistent  Estimation  Up  to  Scale 

The directional model restricts the direction of the price change to be most likely, but not certain, to follow the sign of expected excess demand, or equivalently

M5.1 (directional model): $\Pr(\Delta p_{t+1}>0 \mid \beta^0 x_t>0) > \Pr(\Delta p_{t+1}\le 0 \mid \beta^0 x_t>0)$, and $\Pr(\Delta p_{t+1}\le 0 \mid \beta^0 x_t\le 0) > \Pr(\Delta p_{t+1}>0 \mid \beta^0 x_t\le 0)$.

The motivation for M5.1 is its compatibility with an intuitively appealing forecast procedure: if a shortage is expected at time $t$, $\beta^0 x_t>0$, then predict a positive price change, $\Delta p_{t+1}>0$; otherwise, predict a nonpositive change. Given M5.1, the number of correct forecasts must eventually exceed the number incorrect.

The forecast procedure in turn motivates a strategy for estimating $\beta^0$ from $n$ observations on $(\Delta p_{t+1}, x_t)$: choose as an estimate of $\beta^0$ a value $\beta$ that maximizes the proportion of the observations characterized by $1(\Delta p_{t+1}>0) = 1(\beta x_t>0)$. This is the method of maximum score. We propose the score estimator:

$$\hat{\beta}_n = \arg\max_{\beta\in B}\ g_n(\beta), \quad \text{where } g_n(\beta)=n^{-1}\sum_{t=1}^{n} g_t(\beta), \text{ and}$$

$$g_t(\beta) = 1(\Delta p_{t+1}>0)\,1(\beta x_t>0) + 1(\Delta p_{t+1}\le 0)\,1(\beta x_t\le 0).$$

The function $g_t(\cdot)$ "scores" one if a candidate $\beta$ implies a forecast compatible with the maintained model, M5.1, and zero otherwise.
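The score estimator defined above can be sketched as follows for a two-dimensional $x_t$. The grid search over unit-length vectors is one simple way to carry out the maximization and is an assumption of this sketch, as are the simulated data and the function names; the dissertation does not specify a numerical procedure at this point.

```python
import math
import random

def score(beta, data):
    """g_n(beta): share of observations whose price-change sign is forecast correctly."""
    hits = 0
    for dp, x in data:
        bx = sum(b * xi for b, xi in zip(beta, x))
        hits += (dp > 0 and bx > 0) or (dp <= 0 and bx <= 0)
    return hits / len(data)

def max_score(data, grid=360):
    """Grid search over unit-length beta in R^2; returns (beta_hat, maximized score)."""
    best, best_val = None, -1.0
    for j in range(grid):
        angle = 2.0 * math.pi * j / grid
        beta = (math.cos(angle), math.sin(angle))
        val = score(beta, data)
        if val > best_val:
            best, best_val = beta, val
    return best, best_val

# Simulated data in which the sign of beta0'x is the better predictor of the
# sign of the price change, so that a condition like M5.1 holds.
rng = random.Random(0)
beta0 = (0.6, 0.8)
data = []
for _ in range(2000):
    x = (rng.gauss(0, 1), rng.gauss(0, 1))
    dp = beta0[0] * x[0] + beta0[1] * x[1] + rng.gauss(0, 1)
    data.append((dp, x))
beta_hat, val = max_score(data)
```

Restricting the search to unit-length vectors reflects the fact that only the direction of $\beta^0$ can be recovered from the signs; the objective is a step function, so derivative-based optimizers are not suitable.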

Manski (1985) presents a consistent score estimator for a model of the form MED$(y \mid x)=bx$, where MED$(z)$ denotes the median of the random variable $z$. His consistency proof, however, depends on the weaker model: $\Pr(y>0 \mid bx>0) > \Pr(y\le 0 \mid bx>0)$ and $\Pr(y\le 0 \mid bx\le 0) > \Pr(y>0 \mid bx\le 0)$. We have postulated our model in the weaker form for two reasons. First, the weaker model is easy to interpret as a price adjustment model; positive price changes occur most frequently with expected shortages, and negative changes with expected surpluses. Second, but not less important, MED$(\Delta p_{t+1} \mid x_t) = \beta^0 x_t$ is unnecessarily restrictive.

Manski's consistency proof (1985, p. 323) is directly applicable for $\hat{\beta}_n$ assuming appropriate regularity conditions are met. Theorem 5.2 below provides assumptions that imply $\hat{\beta}_n$ converges to $\beta^0$ almost everywhere (a.e.) as $n$ becomes indefinitely large.

Theorem 5.2. In addition to M5.1 assume:

A5.3. (continuity): $E(g_t(\beta)) = \bar{g}(\beta)$ is continuous in $\beta$ on a compact set $B$.

A5.4. (identification): The set $A_x(\beta) = \{x: \operatorname{sgn}(\beta^0 x) \ne \operatorname{sgn}(\beta x)\}$ has positive probability for all $\beta\in B$ such that $\beta\ne\beta^0$.

Then $\lim \hat{\beta}_n = \beta^0$ a.e.

Proof:

Step 1. Uniform convergence.

The proof of uniform convergence uses the argument presented in Manski (1985, pp. 321-2). Observe that

$$g_n(\beta) = P_n(\Delta p_{t+1}>0,\ \beta x_t>0) + P_n(\Delta p_{t+1}\le 0,\ \beta x_t\le 0), \text{ and}$$

$$\bar{g}(\beta) = P(\Delta p_{t+1}>0,\ \beta x_t>0) + P(\Delta p_{t+1}\le 0,\ \beta x_t\le 0),$$

where $P_n$, $P$ represent the empirical and true distributions. Therefore, the generalized Glivenko-Cantelli theorem of Rao (1962, Theorem 7.2) implies

$$\lim\ \sup_{\beta\in B} |g_n(\beta) - \bar{g}(\beta)| = 0 \quad \text{a.e.}$$

Step 2. Identification.

M5.1 and A5.4 imply that $\beta^0$ uniquely maximizes $\bar{g}(\beta)$. To see this, consider

$$E(g_t(\beta^0) - g_t(\beta)) = \int_{A_x^c(\beta)} E(g_t(\beta^0) - g_t(\beta) \mid x_t)\,dF_x + \int_{A_x(\beta)} E(g_t(\beta^0) - g_t(\beta) \mid x_t)\,dF_x,$$

where $A_x^c(\beta)$ denotes the complement of $A_x(\beta)$, and $F_x$ the distribution function of $x$. The first term on the right-hand side vanishes given the definition of $g_t$, and under M5.1 the second term is strictly positive.

Step 3. $\lim \hat{\beta}_n = \beta^0$ a.e.

Given A5.3, Step 1, and Step 2, a.e. convergence follows from Theorem 2 of Manski (1983).

Q.E.D.

The  assumptions  permit  a  fairly  general  disequilibrium  model.   The 
consistency  proof  does  not  depend  on  the  distributions  of  e   and  e9  , 
or  how  the  market  determines  the  quantity  transacted.   Consistency 
depends  on  a  price  adjustment  model  which  enters  without  an  explicit 
adjustment  equation,  or  a  known  functional  form  for  the  probability 
distribution  of  prices.   It  suffices  to  believe  that  an  expected 
shortage  (surplus)  is  a  better  predictor  of  a  positive  (nonpositive) 
price  change  than  an  expected  surplus  (shortage). 

The generality of the assumptions, however, has costs. In particular, a careful examination of A5.4 reveals that $\beta^\circ$ is only identified up to an arbitrary scale factor. The identification problem results from the failure of the obvious, but necessary, condition that $A_x(\beta)$ be nonempty for all $\beta \neq \beta^\circ$. Observe that for any $\lambda > 0$ we have $\mathrm{sgn}(\lambda\beta^\circ x) = \mathrm{sgn}(\beta^\circ x)$ for all vectors $x$, and therefore $A_x(\lambda\beta^\circ)$ is an empty set. Thus, if points of the form $\beta = \lambda\beta^\circ$ are included in the parameter space, $B$, then A5.4 fails, as does identification (Step 2). Manski (1985) resolves the problem by normalizing the parameter space with respect to scale, which effectively eliminates the troublesome points. Scale normalization suffices for A5.4, but the conclusion of Theorem 5.2 becomes $\lim \hat\beta_n = \lambda\beta^\circ$ a.e., where $\lambda$ is an unknown scalar.$^2$

The loss of scale can be interpreted as arising from insufficient information. The directional model represents prior information on the stochastic behavior of the signs of $\Delta p_{t+1}$ and $\beta^\circ x_t$, but not their magnitudes; by construction the estimator depends only on the signs. The limited information permits a fairly general model, but limits what can be learned about $\beta^\circ$. We shall see next that the loss of scale can be eliminated by imposing assumptions on the magnitudes of $\Delta p_{t+1}$ and $\beta^\circ x_t$. At the same time it is possible to retain a considerable amount of generality.
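
The loss of scale can be checked numerically. The sketch below is an illustration only: the simulated data, the parameter values, and the helper names (`sgn`, `dot`, `directional_score`) are assumptions made here, not part of the model in the text. It computes the sign-matching score for a parameter vector and for rescaled copies of it, and the scores coincide because only signs enter.

```python
import random

def sgn(z):
    """Signum convention used in the text: 1 if z > 0, otherwise -1."""
    return 1 if z > 0 else -1

def dot(b, x):
    return sum(bi * xi for bi, xi in zip(b, x))

def directional_score(beta, dp, xs):
    """Fraction of observations where sgn(price change) equals sgn(beta x)."""
    hits = sum(1 for d, x in zip(dp, xs) if sgn(d) == sgn(dot(beta, x)))
    return hits / len(dp)

rng = random.Random(0)
beta_true = (1.0, -0.5)  # hypothetical parameter, chosen for illustration
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
dp = [dot(beta_true, x) + rng.gauss(0, 1) for x in xs]  # simulated price changes

base = directional_score(beta_true, dp, xs)
# Rescale by positive factors (powers of two, so the rescaling is exact in floating point).
scaled = [directional_score(tuple(lam * b for b in beta_true), dp, xs)
          for lam in (0.25, 2.0, 64.0)]
```

Every entry of `scaled` equals `base`, so a maximizer of this score cannot distinguish a parameter vector from any positive multiple of it.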

5.3 A Price Adjustment Model with $\beta^\circ$ Identified (Without a Loss of Scale)

Manski (1985) discusses the score estimator for a binary response model where the dependent variable, $y^*$, is unobservable, and the sample consists of observations on $1(y^* > 0)$. In the last section the price change was treated analogously to obtain a robust method of estimation. Unlike the problem considered by Manski, however, $\Delta p_{t+1}$ is generally observable. To take advantage of the extra information, and thus obtain a stronger result, we propose the following model.

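M5.5 (directional-magnitude model): for appropriately specified numbers $\epsilon > 0$ and $\delta > 0$,

$$\Pr(\Delta p_{t+1} > \epsilon \mid \beta^\circ x_t > \delta) > \max\big(\Pr(|\Delta p_{t+1}| \le \epsilon \mid \beta^\circ x_t > \delta),\ \Pr(\Delta p_{t+1} < -\epsilon \mid \beta^\circ x_t > \delta)\big),$$

$$\Pr(|\Delta p_{t+1}| \le \epsilon \mid |\beta^\circ x_t| \le \delta) > \max\big(\Pr(\Delta p_{t+1} > \epsilon \mid |\beta^\circ x_t| \le \delta),\ \Pr(\Delta p_{t+1} < -\epsilon \mid |\beta^\circ x_t| \le \delta)\big),$$

$$\Pr(\Delta p_{t+1} < -\epsilon \mid \beta^\circ x_t < -\delta) > \max\big(\Pr(\Delta p_{t+1} > \epsilon \mid \beta^\circ x_t < -\delta),\ \Pr(|\Delta p_{t+1}| \le \epsilon \mid \beta^\circ x_t < -\delta)\big).$$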


The  directional-magnitude  model  quantifies  the  notion  that  large  (small) 
discrepancies  between  expected  buy  and  sell  decisions  are  most  likely  to 
lead to relatively large (small) price changes. The model predicts a small price change ($|\Delta p_{t+1}| \le \epsilon$) if the expected market position lies within a specified interval centered at equilibrium ($|\beta^\circ x_t| \le \delta$), and larger changes ($|\Delta p_{t+1}| > \epsilon$) otherwise.

Compared  to  M5.1,  the  model  M5.5  is  more  restrictive  as  it 
restricts  both  the  direction  and  magnitude  of  the  price  change.  We 
shall see, however, that M5.5 distinguishes $\beta^\circ$ from $\lambda\beta^\circ$, and thus it becomes meaningful to discuss estimators that converge unambiguously to $\beta^\circ$.

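Given M5.5 we define a score estimator of $\beta^\circ$ as follows:

$$\hat\beta_n = \arg\max_{\beta \in B} h_n(\beta), \text{ where}$$

$$h_t(\beta) = 1(\Delta p_{t+1} > \epsilon)\,1(\beta x_t > \delta) + 1(|\Delta p_{t+1}| \le \epsilon)\,1(|\beta x_t| \le \delta) + 1(\Delta p_{t+1} < -\epsilon)\,1(\beta x_t < -\delta).$$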
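
A minimal sketch of this score may help fix ideas. It assumes that the sample score is the average of the per-observation scores, which is how we read the definition; the function names and the three data points are hypothetical.

```python
def h_t(beta, dp, x, eps, delta):
    """Per-observation directional-magnitude score (1 on agreement, else 0)."""
    bx = sum(b * xi for b, xi in zip(beta, x))
    if dp > eps:                      # large positive price change
        return 1 if bx > delta else 0
    if dp < -eps:                     # large negative price change
        return 1 if bx < -delta else 0
    return 1 if abs(bx) <= delta else 0   # small price change

def h_n(beta, dps, xs, eps, delta):
    """Sample score, taken here to be the average of h_t (an assumption)."""
    return sum(h_t(beta, d, x, eps, delta) for d, x in zip(dps, xs)) / len(dps)

# Hypothetical noise-free data with beta° = (1.0,), eps = delta = 1.
xs = [(0.6,), (1.5,), (-2.0,)]
dps = [0.6, 1.5, -2.0]               # price change equals beta° x here
score_true = h_n((1.0,), dps, xs, 1.0, 1.0)
score_doubled = h_n((2.0,), dps, xs, 1.0, 1.0)
```

On these points the true parameter scores 1.0 while the doubled parameter scores 2/3, because the observation with a small price change is misclassified once the index is rescaled. The thresholds are what allow the score to respond to scale.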

To prove $\lim \hat\beta_n = \beta^\circ$ a.e. using the arguments in the proof of Theorem 5.2, the relevant assumptions are:

A5.6 (continuity): $h(\beta)$ is continuous in $\beta$ on a compact set $B$.
A5.7 (identification): The set $J_x(\beta) = \{x : \mathrm{sgn}(\beta^\circ x - \delta) \neq \mathrm{sgn}(\beta x - \delta)\}$ has positive probability for all $\beta \in B$ such that $\beta \neq \beta^\circ$.

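The important difference between the above assumptions and those of Section 5.2 lies in the identification assumptions A5.4 and A5.7. Specifically, assumption A5.7 does not require a normalized parameter space since there generally exist vectors $x$ such that $\mathrm{sgn}(\beta^\circ x - \delta) \neq \mathrm{sgn}(\beta x - \delta)$ for $\beta \neq \beta^\circ$; i.e., the set $J_x(\beta)$ is nonempty for $\beta \neq \beta^\circ$. Therefore, it is possible to restrict the distribution of $x$ so that $J_x(\beta)$ has positive probability for $\beta \neq \beta^\circ$, and to identify $\beta^\circ$ without a loss of scale. We summarize the result in the following theorem.

Theorem 5.8. Suppose the i-th component of the vector $\beta^\circ$ is nonzero. Then for all $\beta$ such that $\beta_i \neq \beta_i^\circ$ and $\beta_i \neq 0$, the set $J_x(\beta)$ is nonempty.

Proof:

It suffices to show that there exists at least one solution $x$ to the system of linear equations $M(\beta^\circ, \beta)x = \Gamma$, where

$$M(\beta^\circ, \beta) = \begin{pmatrix} \beta_1^\circ & \cdots & \beta_k^\circ \\ \beta_1 & \cdots & \beta_k \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} \gamma^\circ \\ \gamma \end{pmatrix}, \qquad \text{and } \gamma^\circ > \delta > \gamma \text{ or } \gamma > \delta > \gamma^\circ.$$

The existence of $x$ is equivalent to $\mathrm{rank}(M(\beta^\circ, \beta)) = \mathrm{rank}(M(\beta^\circ, \beta)\ \Gamma)$. If $\mathrm{rank}(M(\beta^\circ, \beta)) = 2$, then the proof is complete. If $\mathrm{rank}(M(\beta^\circ, \beta)) = 1$, then we need $\beta_i/\beta_i^\circ = \gamma/\gamma^\circ$. The existence of such points $\gamma^\circ$ and $\gamma$ follows immediately since

$$\{\gamma/\gamma^\circ : \gamma > \delta > \gamma^\circ\} = (-\infty, 0) \cup (1, \infty), \text{ and}$$

$$\{\gamma/\gamma^\circ : \gamma^\circ > \delta > \gamma\} = (-\infty, 1).$$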
Q.E.D. 
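
The rank-one case of the proof can be made concrete. The sketch below is an illustration under our own assumptions (the function name `separating_x` and the particular choice of the target value are not from the text): for a rescaled parameter $c\beta^\circ$ with $c \neq 0, 1$ it constructs a vector $x$ at which the two thresholded signs disagree.

```python
def sgn(z):
    """Signum convention used in the text: 1 if z > 0, otherwise -1."""
    return 1 if z > 0 else -1

def separating_x(beta0, c, delta, i=0):
    """Return x with sgn(beta0.x - delta) != sgn(c*beta0.x - delta).

    beta0[i] must be nonzero, delta > 0, and c must differ from 0 and 1.
    """
    if c in (0, 1) or beta0[i] == 0 or delta <= 0:
        raise ValueError("need c not in {0, 1}, beta0[i] != 0, delta > 0")
    # Target value g0 for beta0.x, chosen so g0 and c*g0 lie on opposite sides of delta.
    g0 = 2.0 * delta if c < 0 else 0.5 * delta * (1.0 + 1.0 / c)
    x = [0.0] * len(beta0)
    x[i] = g0 / beta0[i]              # only the i-th regressor is nonzero
    return x

beta0 = (2.0, -1.0)                   # hypothetical parameter vector
checks = []
for c in (-3.0, 0.5, 2.0):
    x = separating_x(beta0, c, 1.0)
    b0x = sum(b * xi for b, xi in zip(beta0, x))
    checks.append(sgn(b0x - 1.0) != sgn(c * b0x - 1.0))
```

For each of the three values of `c` the constructed point separates the two parameters, which is the nonemptiness that Theorem 5.8 asserts for proportional vectors.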

5.4  Maximum  Score  Estimation  of  Models  That  Include  the  Quantity 

Transacted 

The estimators presented in sections 5.2 and 5.3 do not depend on the observed quantity transacted, $Q$, and therefore neglect relevant sample information. In this section we propose a model for $Q$, and define a score estimator of $\beta^\circ$ that depends on $n$ observations of $Q$. We shall see, however, that the model for $Q$ is insufficient to identify $\beta^\circ$ (even up to a multiplicative scalar). We resolve the identification problem by combining the model for $Q$ with the price adjustment models described in sections 5.2 and 5.3. The score estimator we define for the combined model uses the entire sample $\{Q_t, \Delta p_{t+1}, x_t\}_{t=1}^{n}$, and therefore can be expected to be more efficient than the estimators of sections 5.2 and 5.3.

The  observations  on  the  quantity  transacted  are  modeled  as  follows: 

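M5.9 (quantity model): For some given $\delta > 0$,

$$\Pr(Q_t > \delta \mid \beta_1^\circ x_t > \delta,\ \beta_2^\circ x_t > \delta) > \Pr(Q_t \le \delta \mid \beta_1^\circ x_t > \delta,\ \beta_2^\circ x_t > \delta),$$
and

$$\Pr(Q_t \le \delta \mid \beta_1^\circ x_t \le \delta,\ \beta_2^\circ x_t \le \delta) > \Pr(Q_t > \delta \mid \beta_1^\circ x_t \le \delta,\ \beta_2^\circ x_t \le \delta).$$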

Two  appealing  assumptions   that  are   sufficient  for  M5.9,   and  therefore 
motivate   it,   are 

A5.10 $Q_t = \min(D_t, S_t)$.

A5.11 $\mathrm{MED}(\epsilon_{1t}) = \mathrm{MED}(\epsilon_{2t}) = 0$, and $\epsilon_{1t}$ and $\epsilon_{2t}$ are independent.

Assumption A5.11 requires only independent error terms with distributions symmetrical about zero.

To construct an estimator of $\beta^\circ$ given the quantity model, we define the scoring function:

$$q_t(\beta) = 1(Q_t > \delta)\,1(\beta_1 x_t > \delta,\ \beta_2 x_t > \delta) + 1(Q_t \le \delta)\,1(\beta_1 x_t \le \delta,\ \beta_2 x_t \le \delta).$$

To prove consistency for a maximizer of $q_n(\beta)$ using the arguments in the proof of Theorem 5.2, the relevant assumptions are:

A5.12 (continuity): $q(\beta)$ is continuous in $\beta$ on a compact set $B$.
A5.13 (identification):

(i) The set $U_x(\beta) = \{x : \mathrm{sgn}(\beta_1^\circ x - \delta) \neq \mathrm{sgn}(\beta_1 x - \delta),\ \mathrm{sgn}(\beta_2^\circ x - \delta) \neq \mathrm{sgn}(\beta_2 x - \delta)\}$ has positive probability for all $\beta \in B$ such that $(\beta_1, \beta_2) \neq (\beta_1^\circ, \beta_2^\circ)$.

(ii) The set $Z_x(\beta^\circ) = \{x : \mathrm{sgn}(\beta_1^\circ x - \delta) \neq \mathrm{sgn}(\beta_2^\circ x - \delta)\}$ has zero probability.

The role of assumption A5.13 in proving consistency is analogous to that of the previous identification assumptions A5.4 and A5.7. The two parts of A5.13 imply that $\beta^\circ$ uniquely maximizes $q(\beta)$. Part (i) compares to the familiar order condition needed for identification in the textbook simultaneous equation framework. For example, if the supply and demand equations have no explanatory variables in common, and $\delta > 0$, then Theorem 5.8 implies that $U_x(\beta)$ is nonempty for $\beta \neq \beta^\circ$. To see the role of part (ii), suppose that the sets $Z^cU = \{Z_x^c(\beta^\circ) \cap U_x(\beta)\}$, $Z^cU^c$, $ZU$, and $ZU^c$ each have positive probability for some $\beta \neq \beta^\circ$. Then we can write,

$$E(q_t(\beta^\circ) - q_t(\beta)) = \int_{Z^cU} E(q_t(\beta^\circ) - q_t(\beta) \mid x_t)\,dF_x + \int_{Z^cU^c} E(q_t(\beta^\circ) - q_t(\beta) \mid x_t)\,dF_x$$
$$\qquad + \int_{ZU} E(q_t(\beta^\circ) - q_t(\beta) \mid x_t)\,dF_x + \int_{ZU^c} E(q_t(\beta^\circ) - q_t(\beta) \mid x_t)\,dF_x.$$

It can be readily verified that the first term on the right-hand side is positive, the second is nonnegative, the third is zero, and the last term is negative. Therefore, given the negativity of the last term, $\beta \neq \beta^\circ$ does not necessarily imply $E(q_t(\beta^\circ) - q_t(\beta)) > 0$. To rule out this possibility, we impose part (ii).

The requirement that $Z_x(\beta^\circ)$ has zero probability, however, is too restrictive to be generally applicable. It is difficult to imagine a situation where such an assumption would be appropriate. Therefore, unless one is willing to severely restrict the distribution of $x_t$, the model M5.9 is insufficient to identify $\beta^\circ$. Assumption A5.13(ii) can be relaxed, however, by combining the model for $Q$ with the price adjustment model of Section 5.2, and constructing a score estimator that exploits both models. For this purpose we assume that the price adjustment model M5.1 holds in addition to M5.9, and consider the scoring function:

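$$q_t^*(\beta, \beta^\circ) = 1(Z_x^c(\beta^\circ))\,q_t(\beta) + 1(Z_x(\beta^\circ))\,p_t(\beta),$$

where $p_t(\beta) = 1(\Delta p_{t+1} \le 0)\,1(\beta_1 x_t \le \delta,\ \beta_2 x_t > \delta) + 1(\Delta p_{t+1} > 0)\,1(\beta_1 x_t > \delta,\ \beta_2 x_t \le \delta)$, $1(Z_x^c(\beta^\circ)) = 1(x_t \in Z_x^c(\beta^\circ))$, and $Z_x^c(\beta^\circ)$ denotes the complement of $Z_x(\beta^\circ)$.

Generally $Z_x(\beta^\circ)$ will be unknown, but if a consistent estimate, say $\tilde\beta_n$, is available, then it can be replaced by $Z_x(\tilde\beta_n)$. One possible choice for $\tilde\beta_n$ is the estimator presented in Section 5.3. This forms the basis for a "total" sample estimator of $\beta^\circ$:

$$\hat\beta_n = \arg\max_{\beta \in B} q_n^*(\beta, \tilde\beta_n).$$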
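
The combined score can be sketched in a few lines. This is an illustration under stated assumptions: the parameter is represented as a pair of coefficient vectors, the sample score is taken to be the average of the per-observation scores, and all function names and numbers are hypothetical.

```python
def dot(b, x):
    return sum(bi * xi for bi, xi in zip(b, x))

def q_t(Q, b1x, b2x, delta):
    """Quantity score: 1(Q > delta) must agree with both indices exceeding delta."""
    if Q > delta:
        return int(b1x > delta and b2x > delta)
    return int(b1x <= delta and b2x <= delta)

def p_t(dp, b1x, b2x, delta):
    """Price score, used where the two indices lie on opposite sides of delta."""
    if dp > 0:
        return int(b1x > delta and b2x <= delta)
    return int(b1x <= delta and b2x > delta)

def q_star_t(Q, dp, x, beta, beta_ref, delta):
    """Combined score: p_t when x is in Z_x(beta_ref), q_t otherwise."""
    b1x, b2x = dot(beta[0], x), dot(beta[1], x)
    r1x, r2x = dot(beta_ref[0], x), dot(beta_ref[1], x)
    in_Z = (r1x > delta) != (r2x > delta)   # the thresholded signs disagree
    return p_t(dp, b1x, b2x, delta) if in_Z else q_t(Q, b1x, b2x, delta)

def q_star_n(data, beta, beta_ref, delta):
    """Sample score, taken here to be the average of q_star_t (an assumption)."""
    return sum(q_star_t(Q, dp, x, beta, beta_ref, delta)
               for Q, dp, x in data) / len(data)

x = (1.0,)
both_high = ((3.0,), (2.0,))     # both indices exceed delta = 1, so x is outside Z
split = ((3.0,), (0.5,))         # indices on opposite sides of delta, so x is in Z
```

In practice `beta_ref` would be the first-stage estimate, and the total-sample estimator would maximize `q_star_n` over the parameter space; the search itself is not shown.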


To show that $\hat\beta_n$ converges to $\beta^\circ$ a.e. we prove:

Theorem 5.14. Let $\lim \tilde\beta_n = \beta^\circ$ a.e., and $\tilde\beta_n \in B$ for all $n$. In addition to M5.1 and M5.9 assume:

(continuity): $q^*(\beta, \beta')$ is continuous in both arguments on a compact set $B$.

(identification): Assumption A5.13(i) holds.

Then $\lim \hat\beta_n = \beta^\circ$ a.e.

Proof: 

Step  1.   Uniform  convergence. 

The proof is similar to Step 1 of Theorem 5.2. Theorem 7.2 of Rao (1962) implies

$$\lim \sup_{\beta \in B,\ \beta' \in B} |q_n^*(\beta, \beta') - q^*(\beta, \beta')| = 0 \text{ a.e.}$$
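Step 2. Identification.

Let $d(\beta, \beta^\circ) = q^*(\beta^\circ, \beta^\circ) - q^*(\beta, \beta^\circ)$. We will show that $\beta \neq \beta^\circ$ implies $d(\beta, \beta^\circ) > 0$. Consider,

$$d(\beta, \beta^\circ) = \int_{UZ} E(d_t(\beta, \beta^\circ) \mid x_t)\,dF_x + \int_{UZ^c} E(d_t(\beta, \beta^\circ) \mid x_t)\,dF_x$$
$$\qquad + \int_{U^cZ} E(d_t(\beta, \beta^\circ) \mid x_t)\,dF_x + \int_{U^cZ^c} E(d_t(\beta, \beta^\circ) \mid x_t)\,dF_x,$$

where $UZ = \{U_x(\beta) \cap Z_x(\beta^\circ)\}$, $UZ^c = \{U_x(\beta) \cap Z_x^c(\beta^\circ)\}$, $U^cZ = \{U_x^c(\beta) \cap Z_x(\beta^\circ)\}$, and $U^cZ^c = \{U_x^c(\beta) \cap Z_x^c(\beta^\circ)\}$. That $\beta \neq \beta^\circ$ implies $d(\beta, \beta^\circ) > 0$ follows from the first two terms being positive, and the last two nonnegative. We will prove this for the first and last terms only; the proof for the remaining terms is similar.

Consider the first term, and assume without loss of generality that $\beta_1^\circ x_t - \delta \le 0$ and $\beta_2^\circ x_t - \delta > 0$, and thus $(\beta_1^\circ - \beta_2^\circ)x_t < 0$. Since $x_t \in U_x(\beta)$, we have $\beta_1 x_t - \delta > 0$ and $\beta_2 x_t - \delta \le 0$. Therefore,

$$E(d_t(\beta, \beta^\circ) \mid x_t \in UZ) = \Pr(\Delta p_{t+1} \le 0 \mid x_t) - \Pr(\Delta p_{t+1} > 0 \mid x_t) > 0,$$

where the inequality follows from $(\beta_1^\circ - \beta_2^\circ)x_t < 0$, and M5.1.

For $x_t \in U^cZ^c$ assume without loss of generality that $\beta_1^\circ x_t - \delta > 0$ and $\beta_2^\circ x_t - \delta > 0$. Since $x_t \in U_x^c(\beta)$, we have $\beta_1 x_t - \delta > 0$ and $\beta_2 x_t - \delta > 0$, or $\beta_1 x_t - \delta > 0$ and $\beta_2 x_t - \delta \le 0$, or $\beta_1 x_t - \delta \le 0$ and $\beta_2 x_t - \delta > 0$. Therefore, evaluating the conditional expectation case by case, we find

$$E(d_t(\beta, \beta^\circ) \mid x_t \in U^cZ^c) = \Pr(Q_t > \delta \mid x_t) - \Pr(Q_t > \delta \mid x_t) = 0, \text{ or}$$
$$\qquad = \Pr(Q_t > \delta \mid x_t) > 0.$$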

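Step 3. $\lim \sup_{\beta \in B} |q_n^*(\beta, \tilde\beta_n) - q^*(\beta, \beta^\circ)| = 0$ a.e.

Let $\gamma > 0$ be given. Step 1 implies

$$\sup_{\beta \in B} |q_n^*(\beta, \tilde\beta_n) - q^*(\beta, \tilde\beta_n)| < \gamma/2 \text{ a.e.}$$

for sufficiently large $n$. The continuity of $q^*$, and the compactness of $B$ imply

$$\sup_{\beta \in B} |q^*(\beta, \tilde\beta_n) - q^*(\beta, \beta^\circ)| < \gamma/2 \text{ a.e.}$$

for sufficiently large $n$ since $\lim \tilde\beta_n = \beta^\circ$ a.e. Applying the triangle inequality we get

$$\sup_{\beta \in B} |q_n^*(\beta, \tilde\beta_n) - q^*(\beta, \beta^\circ)| < \gamma \text{ a.e.}$$

for sufficiently large $n$, which is the desired result.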

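Step 4. $\lim \hat\beta_n = \beta^\circ$ a.e.

Let $N$ be an open neighborhood of $\beta^\circ$ and define

$$\epsilon = q^*(\beta^\circ, \beta^\circ) - \sup_{\beta \in N^c \cap B} q^*(\beta, \beta^\circ) > 0,$$

where the existence of $\epsilon$ follows from Step 2, and the compactness of $B$.

Now Step 3 implies $q^*(\hat\beta_n, \beta^\circ) > q_n^*(\hat\beta_n, \tilde\beta_n) - \epsilon/2$ a.e. for large $n$, and since $\hat\beta_n$ maximizes $q_n^*(\cdot, \tilde\beta_n)$ we have

$$q^*(\hat\beta_n, \beta^\circ) > q_n^*(\beta^\circ, \tilde\beta_n) - \epsilon/2 \text{ a.e.} \qquad (5.15)$$

Step 3 also implies

$$q_n^*(\beta^\circ, \tilde\beta_n) > q^*(\beta^\circ, \beta^\circ) - \epsilon/2 \text{ a.e.} \qquad (5.16)$$

for large $n$. Adding both sides of (5.15) and (5.16) we get

$$q^*(\hat\beta_n, \beta^\circ) > \sup_{\beta \in N^c \cap B} q^*(\beta, \beta^\circ) \text{ a.e.}$$

and therefore $\hat\beta_n \in N$ a.e. for sufficiently large $n$.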

Q.E.D. 
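
For readers who want the last step of the proof written out, the two inequalities can be chained as follows. The notation is assumed here for the purpose of the display: $\hat\beta_n$ is the total-sample estimator, $\tilde\beta_n$ the first-stage estimate, and $\epsilon$ the gap defined in Step 4.

```latex
\begin{align*}
q^*(\hat\beta_n,\beta^\circ)
  &> q_n^*(\hat\beta_n,\tilde\beta_n) - \epsilon/2
     && \text{(Step 3)} \\
  &\ge q_n^*(\beta^\circ,\tilde\beta_n) - \epsilon/2
     && \text{($\hat\beta_n$ maximizes $q_n^*(\cdot,\tilde\beta_n)$)} \\
  &> q^*(\beta^\circ,\beta^\circ) - \epsilon
     && \text{(Step 3 again)} \\
  &= \sup_{\beta \in N^c \cap B} q^*(\beta,\beta^\circ)
     && \text{(definition of $\epsilon$)}.
\end{align*}
```

Since no point of $N^c \cap B$ can attain a value of $q^*(\cdot, \beta^\circ)$ strictly above this supremum, the estimator must lie in $N$ for all sufficiently large $n$.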
NOTES


1
The signum function, $\mathrm{sgn}(\cdot)$, is defined as follows: $\mathrm{sgn}(z) = 1$ if $z > 0$, and $\mathrm{sgn}(z) = -1$ if $z \le 0$.

2 
Another  significant  cost  is  that  no  distributional  theory  for 

maximum  score  estimators  is  currently  known. 

3 
Other  comparisons  with  the  so-called  order  condition  for 

identification  are  much  more  complicated,  and  beyond  the  scope  of  this 

paper. 


CHAPTER  6 
CONCLUDING  REMARKS  AND  DIRECTIONS  FOR  FURTHER  RESEARCH 


In  this  thesis,  I  have  proposed  several  new  solutions  to  the 
problem  of  generalizing  disequilibrium  models  and  their  estimators.   The 
empirical example in Chapter 4 demonstrates how to implement many of
these solutions in practice. However, as we have seen, while some of the
solutions  solve  old  problems,  they  also  introduce  new  complications. 
For  example,  while  the  methods  presented  in  Chapter  3  eliminate  the  need 
to  specify  a  parametric  model  for  serial  correlation,  they  also 
introduce  the  complication  of  having  to  choose  a  single  covariance 
estimator  from  several  candidates.   Clearly,  some  of  the  results  fall 
short  of  completely  generalizing  disequilibrium  models  and  their 
estimators;  there  is  a  trade-off.   I  believe,  however,  that  this  thesis 
accomplishes  more  than  merely  shifting  the  problems  faced  by  empirical 
studies  from  old  ones  to  new  ones.   In  particular,  it  provides  a  solid 
foundation  for  further  research  by  clarifying  many  of  the  issues 
involved. The following is a partial list of directions for further research on the problem of generalizing disequilibrium models and their estimators:

(1) the consequences of restricting the conditional probabilities $\Pr(\Delta p_{t+1} > 0 \mid D_t > S_t)$ and $\Pr(\Delta p_{t+1} > 0 \mid D_t < S_t)$ to be invariant with respect to $t$, and how to relax this restriction;

(2)  the  problem  of  finding  an  optimal  covariance  estimator  when  the 
serial  correlation  is  modeled  by  mixing  conditions; 
(3) the power properties of the serial correlation test in section 3.5;

(4)  the  small  sample  properties  of  estimators  obtained  from  starting 
iterative  techniques  with  consistent  estimates,  but  stopping 
iteration  before  convergence; 

(5)  numerical  studies  examining  the  properties  of  the  maximum  score 
estimators  for  disequilibrium  models  relative  to  parametric 
estimators. 


APPENDIX 

A.1 Inconsistency and Misclassified Observations

We will show that constraining the direction of the price change $1(\Delta p_{t+1} > 0)$ to separate the sample into the underlying demand ($Q_t = D_t$) and supply ($Q_t = S_t$) regimes, when in fact $1(\Delta p_{t+1} > 0)$ misclassifies observations with positive probability, leads to inconsistent estimates. Consider the estimator $\hat\theta_n(1,0)$ which solves the problem

$$\max_{(\theta, p_{11}, p_{10})} L_n(\theta, p_{11}, p_{10}) \text{ subject to } (p_{11}, p_{10}) = (1, 0),$$

where $L_n(\theta, p_{11}, p_{10})$ is defined on page 14, equation 2.3. We will show that $p_{11}^\circ < 1$ and $p_{10}^\circ = 0$ imply $\mathrm{plim}\ \hat\theta_n(1,0) \neq \theta^\circ$. The proof of $\mathrm{plim}\ \hat\theta_n(1,0) \neq \theta^\circ$ proceeds as follows: we derive a necessary condition for the consistency of an estimator that solves a maximization problem, show that the condition is violated, and hence conclude $\mathrm{plim}\ \hat\theta_n \neq \theta^\circ$.

The necessary condition for consistency can be viewed as either a global or local condition depending on whether the estimator is a global or local maximizer of $L_n$. The global condition appears as the conclusion of the following theorem.

Theorem A.1.1. Let $\hat\theta_n(y)$ be a function of the observations such that $L_n(\hat\theta_n, y) \ge L_n(\theta, y)$ for all $n$ and all $\theta \in \Xi$, where $\Xi$ is a subset of a Euclidean space. Define

$$L_n(\theta, \theta', y, \rho) = \sup\{L_n(t, y) - L_n(\theta', y) : |t - \theta| \le \rho\},$$

and let $\bar L_n(\theta, \theta', \rho) = E(L_n(\theta, \theta', y, \rho))$. Suppose

(i) For all sufficiently small $\rho(\theta) \ge \rho > 0$,

$$\mathrm{plim}\,(L_n(\theta, \theta', y, \rho) - \bar L_n(\theta, \theta', \rho)) = 0.$$

(ii) $\bar L_n(\theta, \theta', \rho)$ decreases to $\bar L_n(\theta, \theta', 0)$ uniformly in $n$ as $\rho$ decreases to zero.

If $\mathrm{plim}\ \hat\theta_n = \theta^\circ$, then $\limsup\,\{\bar L_n(\theta^\circ, \theta, 0)\} \ge 0$ for all $\theta \in \Xi$.


Proof: 

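Suppose there exists $\theta^* \in \Xi$ such that $\limsup\,\{\bar L_n(\theta^\circ, \theta^*, 0)\} < 0$. Then by (ii) we can choose $\rho > 0$ such that $\limsup\,\{\bar L_n(\theta^\circ, \theta^*, \rho)\} < 0$. Now define $N = \{\theta : |\theta - \theta^\circ| \le \rho\}$, and

$$R_n = \sup\{L_n(t, y) - L_n(\theta^*, y) : |t - \theta^\circ| \le \rho\}.$$

Since $\hat\theta_n \in N$ implies $R_n \ge 0$, it suffices to show that $\lim_{n \to \infty} \Pr(R_n < 0) = 1$.

Let $M_n = \bar L_n(\theta^\circ, \theta^*, \rho)$ and $d = \limsup M_n < 0$. Now for sufficiently large $n$ we have $M_n \le d/2$ since $d < 0$, and hence $(d/4) - M_n \ge -d/4 > 0$. Therefore,

$$\Pr(R_n \le d/4) = \Pr(R_n - M_n \le (d/4) - M_n) \ge \Pr(R_n - M_n \le -d/4) \to 1 \text{ as } n \to \infty \text{ by (i)}.$$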

Q.E.D. 


Under additional regularity conditions, the conclusion of Theorem A.1.1 can be viewed as a local condition.

Theorem A.1.2. In addition to A.1.1(i) and A.1.1(ii), suppose

(i) $\partial \bar L_n(\theta)/\partial\theta = \overline{\partial L_n(\theta)/\partial\theta}$; that is, the order of integration and differentiation can be interchanged,
(ii) $\theta^\circ$ is an interior point of $\Xi$,
(iii) $\partial \bar L_n(\theta)/\partial\theta$ is continuous on a closed neighborhood $N_1$ of $\theta^\circ$ with radius $\epsilon_1 > 0$, for all $n$ sufficiently large.

Let $\partial \bar L_n(\theta)/\partial\theta_i = \bar L_n(\theta)_i$. If for some $i$ there exists a positive constant $m_i$ such that $|\bar L_n(\theta)_i| \ge m_i$ for all $\theta$ belonging to a closed neighborhood of $\theta^\circ$ with radius $\epsilon_2 > 0$, $N_2$, for all $n$ sufficiently large, then $\mathrm{plim}\ \hat\theta_n \neq \theta^\circ$.

Proof: 

We will prove $\mathrm{plim}\ \hat\theta_n \neq \theta^\circ$ by showing that the hypothesis of the theorem implies $\limsup\,\{\bar L_n(\theta^\circ) - \bar L_n(\theta_n^*)\} < 0$ for some sequence $\{\theta_n^*\}$ belonging to $\Xi$.

Let $\epsilon_3 = \min(\epsilon_1, \epsilon_2)$. Since $N_3$ is compact and $\bar L_n(\theta)$ is continuous on $N_3$, there exist points $\theta_n^*$ belonging to $N_3$ such that $\bar L_n(\theta_n^*) = \sup\{\bar L_n(\theta) : \theta \text{ belongs to } N_3\}$. Furthermore, since $|\bar L_n(\theta)_i| > 0$ on $N_3$, the points $\theta_n^*$ lie on the boundary of $N_3$. Therefore, $|\theta_n^* - \theta^\circ| = \epsilon_3$.

By the mean value theorem we have

$$\bar L_n(\theta_n^*) - \bar L_n(\theta^\circ) = \sum_{i=1}^{k} (\theta_{n,i}^* - \theta_i^\circ)\,\bar L_n(\theta_n')_i, \qquad (2)$$

where $\theta_n'$ lies on the segment connecting $\theta_n^*$ and $\theta^\circ$. Now if $\bar L_n(\theta_n')_i \ge m_i > 0$, then we must have $\theta_{n,i}^* - \theta_i^\circ \ge 0$. Otherwise, since $\bar L_n$ is strictly increasing in its i-th argument on $N_3$, we would have $\bar L_n(\theta_{n,1}^*, \ldots, \theta_i^\circ, \ldots, \theta_{n,k}^*) > \bar L_n(\theta_{n,1}^*, \ldots, \theta_{n,i}^*, \ldots, \theta_{n,k}^*)$, which contradicts the fact that $\theta_n^*$ is a maximizer of $\bar L_n$. Similarly, if $\bar L_n(\theta_n')_j \le m_j < 0$, then $\theta_{n,j}^* - \theta_j^\circ \le 0$.

Without loss of generality suppose

$\bar L_n(\theta_n')_i \ge m_i > 0$ for $i = 1, \ldots, h$ and
$\bar L_n(\theta_n')_i \le m_i < 0$ for $i = h+1, \ldots, k$.

Then by equation (2) we have

$$\bar L_n(\theta_n^*) - \bar L_n(\theta^\circ) \ge \sum_{i=1}^{h} (\theta_{n,i}^* - \theta_i^\circ)\,m_i + \sum_{i=h+1}^{k} (\theta_i^\circ - \theta_{n,i}^*)(-m_i) \ge m \sum_{i=1}^{k} |\theta_{n,i}^* - \theta_i^\circ| \ge m\,d > 0,$$

for some $d > 0$, where $m = \min(m_1, \ldots, m_h, -m_{h+1}, \ldots, -m_k)$. This implies

$$\limsup\,\{\bar L_n(\theta^\circ) - \bar L_n(\theta_n^*)\} < 0.$$

Q.E.D. 


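Therefore, to prove $\mathrm{plim}\ \hat\theta_n(1,0) \neq \theta^\circ$, it suffices to show that $\partial \bar L_n(\theta^\circ; 1, 0)/\partial\beta_1$ is bounded away from zero. We establish this by showing that

$$E(\partial L_n(\theta^\circ; 1, 0)/\partial\beta_1) = (1 - p_{11}^\circ) \sum_{t} x_t\,E(Q_t - D_t)/\sigma_{\epsilon 1}^2. \qquad (3)$$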

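Let $\partial \log f_t(\theta^\circ; 1, 0)/\partial\beta_1 = f_t'$, $1(\cdot) = 1(\Delta p_{t+1} > 0)$, and note that

$$E(f_t') = \int f_t'\,f_t(Q_t, 1(\cdot) \mid \theta^\circ, p_{11}^\circ, p_{10}^\circ = 0)\,dQ_t. \qquad (4)$$

Now if $p_{11}^\circ = 1$, then (4) is the expectation of a likelihood equation, and therefore given the usual regularity conditions we have $E(f_t') = 0$ at $p_{11}^\circ = 1$. This condition will imply

$$-1(\cdot) \int f_t'\,g_{st}\,dQ_t = (1 - 1(\cdot)) \int f_t'\,g_{dt}\,dQ_t. \qquad (5)$$

Substituting (5) into (4) yields

$$E(f_t') = (1 - 1(\cdot))(1 - p_{11}^\circ)\Big(\int f_t'\,g_{dt}\,dQ_t + \int f_t'\,g_{st}\,dQ_t\Big). \qquad (6)$$

For $1(\cdot) = 0$, given the normality of $\epsilon_{1t}$, $\epsilon_{2t}$ we have $f_t' = (Q_t - x_t\beta_1)x_t/\sigma_{\epsilon 1}^2$. Substituting this into (6), and summing over the observations gives (3).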

A.2 The Computational Tractability and Asymptotic Properties of the Least Squares Estimator of Section 2.3

In  Section  2.3  we  proposed  using  a  LS  estimator  to  find  the 
consistent  and  asymptotically  normal  solution  to  the  likelihood 
equations;  i.e.,  use  the  LS  estimates  as  starting  values  to  iterate  to 
the  consistent  and  asymptotically  normal  local  maxima  of  the  likelihood 
function.   The  success  of  this  strategy  depends  on: 
(a) The objective functions to be solved for the LS estimates are not
characterized  by  an  unknown  number  of  local  minima  so  that  global  minima 
can  be  easily  found;  i.e.,  multiple  solutions  are  not  a  problem. 

(b)  The  LS  estimators  (defined  as  global  minimizers)  are  consistent  and 
have  a  proper  limiting  distribution. 

If (a) fails, then the LS method is no more computationally tractable than the ML method, and thus one might as well use the ML method to begin with. Condition (b) ensures convergence to the consistent and asymptotically normal local maximum of the likelihood function. (See, for example, Amemiya (1973, pp. 1014-15).) In this section we will argue that both (a) and (b) are likely to be satisfied in practice.

Condition  (a)  will  be  obviously  satisfied  if  the  following 
optimization  problems  have  unique  solutions: 


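$$\underset{(p_{11},\,p_{10},\,\gamma)}{\text{local-min}}\ n^{-1} \sum_{t=1}^{n} \big(1(\Delta p_{t+1} > 0) - E(1(\Delta p_{t+1} > 0))\big)^2 \qquad (1)$$

$$\underset{(\beta_1,\,\beta_2,\,\sigma_{\epsilon 1}^2 + \sigma_{\epsilon 2}^2)}{\text{local-min}}\ n^{-1} \sum_{t=1}^{n} \big(Q_t - \hat E(Q_t)\big)^2 \qquad (2)$$

$$\underset{(\sigma_{\epsilon 1}^2,\,\sigma_{\epsilon 2}^2)}{\text{local-min}}\ n^{-1} \sum_{t=1}^{n} \big(Q_t^2 - \hat E(Q_t^2)\big)^2 \qquad (3)$$

where $\hat E(Q_t)$ denotes the function $E(Q_t)$ with $\gamma^\circ$ estimated by $\hat\gamma_n$ (obtained from (1)), and $\hat E(Q_t^2)$ denotes $E(Q_t^2)$ with $\beta_1^\circ$, $\beta_2^\circ$, $(\sigma_{\epsilon 1}^2 + \sigma_{\epsilon 2}^2)$, and $\gamma^\circ$ estimated by $\hat\beta_1$, $\hat\beta_2$, $(\hat\sigma_{\epsilon 1}^2 + \hat\sigma_{\epsilon 2}^2)$, and $\hat\gamma_n$ (obtained from (1) and (2)).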

Solutions to problems (2) and (3) are OLS estimates, and therefore are unique if the appropriate matrices of explanatory variables have full column rank. For example, unique LS estimates can be obtained by solving (2) if the following matrix has full column rank:

$$\begin{pmatrix} (1 - \Phi(x_1\hat\gamma_n))\,x_1^s & \Phi(x_1\hat\gamma_n)\,x_1^d & \phi(x_1\hat\gamma_n) \\ \vdots & \vdots & \vdots \\ (1 - \Phi(x_n\hat\gamma_n))\,x_n^s & \Phi(x_n\hat\gamma_n)\,x_n^d & \phi(x_n\hat\gamma_n) \end{pmatrix}$$

where $x^s$ denotes the $1 \times k$ vector of explanatory variables of the supply equation, and $x^d$ the $1 \times k$ vector of demand explanatory variables. In general, the matrices of explanatory variables for (2) and (3) will have full column rank provided that the functions $\Phi(x_t\hat\gamma_n)$ and $\phi(x_t\hat\gamma_n)$ are not constant for all $t$.

Solutions to problem (1) are nonlinear LS estimates, and consequently establishing their uniqueness is much more difficult. Unfortunately, attempts to prove that problem (1) has a unique solution have been inconclusive. However, there is some evidence suggesting that problem (1) can be solved for a global minimum in practice. First, the larger the sample size the more likely problem (1) will have a unique solution. Lemma A.2.4 below provides a rank condition which ensures a
unique  solution  with  probability  approaching  one  as  n  approaches 
infinity.   Second,  given  the  data  discussed  in  Chapter  4,  attempts  to 
solve  problem  (1)  were  successful  in  the  sense  that  all  starting  values 
iterated  to  the  same  solution.   In  contrast,  attempts  to  maximize  the 
likelihood  function  were  unsuccessful  as  different  starting  values 
iterated  to  different  solutions.   Third,  the  objective  function  in 
problem  (1)  is  bounded  below  (by  zero)  which  simplifies  the  search  of 
the  parameter  space  for  a  global  minimum.   In  contrast,  a  search  for  a 
global maximum of the likelihood function is complicated by unboundedness: $L_n \to \infty$ as $\sigma_{\epsilon 1}^2 \to 0$ or $\sigma_{\epsilon 2}^2 \to 0$ (see, for example, Maddala (1983, p. 300)). Therefore, any search for a global ML estimate will be futile unless one is willing to arbitrarily bound the error variances away from zero.

Next we discuss conditions that imply consistency for the LS estimator. We will only consider conditions that imply consistency for the nonlinear LS estimator defined as any global minimizer of problem (1). (Given $\mathrm{plim}\ \hat\gamma_n = \gamma^\circ$, proving consistency for the OLS estimators obtained from solving problems (2) and (3) involves repeated application of Jennrich's (1969, Lemma 3) mean-value theorem for random functions, and is quite tedious.) For simplicity, rather than necessity, we will assume that all relevant random variables are independent and identically distributed across $t$. This enables us to apply the following simplified version of White's (1980) Lemma 2.2 to the global minimizer of problem (1).

Lemma A.2.1. Let $Q_n(w, \theta)$ be a measurable function on a measurable space $W$ and for each $w$ in $W$ a continuous function on a compact set $\Xi$. Then there exists a measurable function $\hat\theta_n(w)$ such that

$$Q_n(w, \hat\theta_n(w)) = \inf_{\theta \in \Xi} Q_n(w, \theta) \text{ for all } w \text{ in } W.$$

If $\mathrm{plim}\,\{\sup_{\theta \in \Xi} |Q_n(w, \theta) - Q(\theta)|\} = 0$, and if $Q(\theta)$ has a unique minimum at $\theta^\circ$, then $\mathrm{plim}\ \hat\theta_n = \theta^\circ$.

Proof: See White (1980, Lemma 2.2).


The first part of Lemma A.2.1 ensures the existence of the nonlinear LS estimator (defined as a global minimizer). The second part will be used to show consistency. For this purpose we define,

$$Q_n(\theta) = n^{-1} \sum_{t=1}^{n} \big(1(\Delta p_{t+1} > 0) - E(1(\Delta p_{t+1} > 0))\big)^2 = n^{-1} \sum_{t=1}^{n} (z_t(\theta) + u_{1t})^2,$$

where $z_t(\theta) = p_{11}^\circ - p_{11} - (p_{11}^\circ - p_{10}^\circ)\,\Phi(x_t\gamma^\circ) + (p_{11} - p_{10})\,\Phi(x_t\gamma)$, and $\theta = (p_{11}, p_{10}, \gamma)$. To apply the second part of Lemma A.2.1 we need to show uniform convergence, and that $Q(\theta)$ has a unique minimum at $\theta^\circ$. The next lemma, which is due to Hoadley (1971), provides a moment restriction that implies uniform convergence.

Lemma A.2.2. For the function defined in Lemma A.2.1 suppose $E|Q_n(\theta)|^{1+d} \le m < \infty$ for some $d > 0$. Then $\mathrm{plim}\,(\sup_{\theta \in \Xi} |Q_n(\theta) - Q(\theta)|) = 0$.

Proof: See Hoadley (1971, Theorem A.5).

The following lemma establishes that the moment restriction holds.

Lemma A.2.3. $E|Q_n(\theta)|^{1+d} \le m < \infty$ for $d > 0$.

Proof: Since $z_t(\theta)$ is bounded we have

$$(z_t(\theta) + u_{1t})^2 \le 2z_t(\theta)^2 + 2u_{1t}^2 \le c + 2u_{1t}^2$$

for some finite constant $c$. Therefore, the conclusion of the lemma follows if $E|u_{1t}|^{2+d} \le m$, $d > 0$. Let $1_t = 1(\Delta p_{t+1} > 0)$, set $d = 1$, note that $E|1_t|^k = E(1_t) < \infty$, $k = 1, 2, \ldots$, and recall that $u_{1t} = 1_t - E(1_t)$. Thus,

$$E|u_{1t}|^3 \le E|1_t|^3 + 3E|1_t^2\,E(1_t)| + 3E|1_t\,(E(1_t))^2| + |E(1_t)|^3 < \infty.$$
Q.E.D.

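Finally, we present a rank condition that implies $Q(\theta)$ has a unique minimum at $\theta^\circ$, and therefore together with Lemma A.2.3 ensures consistency for a global minimizer of $Q_n(\theta)$.

Lemma A.2.4. Suppose $x_t$ is a discrete random variable, and let $x^i$ denote the i-th member of the support of $x_t$. For each $\theta \in \Xi$ such that $\theta \neq \theta^\circ$, suppose there exist $k \ge 1$ members of the support of $x_t$ such that the following matrix has full column rank:

$$A_k = \begin{pmatrix} 1 & \Phi(x^1\gamma) & \Phi(x^1\gamma^\circ) \\ \vdots & \vdots & \vdots \\ 1 & \Phi(x^k\gamma) & \Phi(x^k\gamma^\circ) \end{pmatrix}$$

If $p_{11}^\circ > p_{10}^\circ$, then $Q(\theta)$ has a unique minimum at $\theta^\circ$.

Proof: Since $E(u_{1t} \mid x_t) = 0$, we have

$$Q(\theta) = E(z_t(\theta) + u_{1t})^2 = E(z_t(\theta)^2) + E(u_{1t}^2).$$

Obviously, $Q(\theta)$ has a minimum at $\theta^\circ$ since $E(z_t(\theta^\circ)^2) = 0$. To prove uniqueness it suffices to show that $\theta \neq \theta^\circ \iff E(z_t(\theta)^2) > 0$.

Suppose for some $\theta \neq \theta^\circ$, $E(z_t(\theta)^2) = 0$. Since $\Pr(z_t(\theta)^2 \ge 0) = 1$, we have $E(z_t(\theta)^2) = 0 \iff \Pr(z_t(\theta)^2 = 0) = 1$. This implies that for every $x^i$ belonging to the support of $x_t$, $z_t(\theta)^2 = 0$. That is,

$$p_{11}^\circ - p_{11} - (p_{11}^\circ - p_{10}^\circ)\,\Phi(x^i\gamma^\circ) + (p_{11} - p_{10})\,\Phi(x^i\gamma) = 0, \quad i = 1, 2, \ldots$$

But this contradicts the assumption that $A_k$ has full column rank unless

$$p_{11}^\circ - p_{11} = p_{11}^\circ - p_{10}^\circ = p_{11} - p_{10} = 0.$$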

Q.E.D. 
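
The rank condition is easy to check numerically for a candidate parameter value. The sketch below is illustrative only: it assumes a scalar regressor, three support points, and a matrix whose rows are $(1, \Phi(x^i\gamma), \Phi(x^i\gamma^\circ))$, and every name and number in it is hypothetical. With three rows, full column rank is equivalent to a nonzero determinant.

```python
import math

def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def rank_matrix(support, gamma, gamma0):
    """Rows (1, Phi(x*gamma), Phi(x*gamma0)) for each support point x."""
    return [[1.0, Phi(x * gamma), Phi(x * gamma0)] for x in support]

def full_column_rank(support, gamma, gamma0, tol=1e-10):
    """True when the 3x3 matrix built from three support points is nonsingular."""
    return abs(det3(rank_matrix(support, gamma, gamma0))) > tol

support = (-1.0, 0.5, 2.0)     # hypothetical support points of a scalar regressor
distinct = full_column_rank(support, gamma=2.0, gamma0=1.0)
coincide = full_column_rank(support, gamma=1.0, gamma0=1.0)
```

Here `distinct` is true and `coincide` is false: when the candidate slope equals the true slope, the last two columns are identical and the matrix is rank deficient regardless of the support points chosen.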

Finally, we note without proof that Theorem 3.1 of White (1980) can
be applied to show that the nonlinear LS estimator obtained from solving
problem (1) is asymptotically normal. Therefore, the LS estimates have
a proper limiting distribution.

A.3  Proofs of Theorems 3.2-3.17

Proof  of  Theorem  3.2:   See  McLeish  (1975,  Theorem  2.10). 

Proof of Theorem 3.7: The proof is the same as that of Hoadley's (1971)
Theorem 1, except that Theorem 3.2 is applied instead of Markov's law of
large numbers.

Proof of Theorem 3.8: For notational simplicity let $1_t = 1(\Delta p_{t+1}>0)$. Consider an arbitrary point $\theta_p^*\in\Xi$. We will show that given $\varepsilon>0$ there exists $d>0$ such that $|\theta_p-\theta_p^*|\le d$ implies

$$|f_t(Q_t,1_t\mid\theta_p)-f_t(Q_t,1_t\mid\theta_p^*)|<\varepsilon \quad \text{a.e.},$$

where $\varepsilon$ and $d$ do not depend on t.

Assumptions 3.4(i) (normality) and 3.6(i) (compactness) imply

$$\lim_{|Q_t|\to\infty}\sup\{|f_t(Q_t,1_t\mid\theta_p)-f_t(Q_t,1_t\mid\theta_p^*)| : \theta_p\in\Xi,\ \theta_p^*\in\Xi\}=0. \tag{1}$$

Let $\varepsilon>0$ be chosen. Then equation (1) implies that there exist $a_t=a(x_t,1_t)>0$ and $d_t=d(x_t,1_t)>0$ such that for $|Q_t|>a_t$ and $|\theta_p-\theta_p^*|\le d_t$ we have

$$|f_t(Q_t,1_t\mid\theta_p)-f_t(Q_t,1_t\mid\theta_p^*)|<\varepsilon. \tag{2}$$

By assumption 3.5(ii) ($x_t$ has a finite support) equation (2) holds a.e. for $|Q_t|>a=\max(a_1,\ldots,a_k)$ and $|\theta_p-\theta_p^*|\le d=\min(d_1,\ldots,d_k)$. Thus, it remains to show that equation (2) holds a.e. for $Q_t$ belonging to $[-a,a]$.

Let $C=\{(Q_t,1_t,x_t,\theta_p): Q_t \text{ belongs to } [-a,a]\}$. Since $C$ is compact, and $f_t(Q_t,1_t\mid\theta_p)$ is continuous on $C$, it follows that $f_t(Q_t,1_t\mid\theta_p)$ is uniformly continuous on $C$. That is, there exists a $d>0$ such that equation (2) holds a.e. uniformly in t whenever $|\theta_p-\theta_p^*|\le d$.
Q.E.D.

Proof of Lemma 3.9: The result follows from the fact that $\Xi$ is separable and $f_t(Q_t,1(\Delta p_{t+1}>0)\mid\theta_p)$ is continuous on $\Xi$. See, for example, Loeve (1960, p. 510).

Proof of Lemma 3.10: Hartley and Mallela (1977, Corollary 4.2) prove that there exists $\rho(\theta_p)>0$ such that

$$E\left|\sup\{\ln f_t(Q_t,1(\Delta p_{t+1}>0)\mid\theta_p') : |\theta_p'-\theta_p|\le\rho(\theta_p)\}\right|^k \le A<\infty, \tag{3}$$

for k=2. In fact, their arguments can be used to show that (3) holds for any even positive k, and therefore for any positive k.

Proof of Lemma 3.11: The proof involves minor modifications to the proofs given in Amemiya and Sen (1977, Lemmas 2 and 3) to cover the case of $p_{11}\ne p_{10}$.

Proof  of  Lemma  3.14:   See  White  (1984,  Theorem  2.4). 

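Proof of Lemma 3.15: By Theorem 2.3 of White and Domowitz (1984), assumptions 3.15(i) and 3.15(ii) imply

$$\operatorname{plim}\Big\{\sup_{\theta\in\Xi}\Big|n^{-1}\sum_{t=1}^{n}\big(q_t(y_t,\theta)-\bar q_t(y_t,\theta)\big)\Big|\Big\}=0. \tag{4}$$

Given (4) and $\operatorname{plim}\theta_n=\theta^0$, Lemma 2.6 of White (1980) implies

$$\operatorname{plim}\ n^{-1}\sum_{t=1}^{n}\big(q_t(y_t,\theta_n)-q_t(y_t,\theta^0)\big)=0.$$
Q.E.D.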

Proof  of  Theorem  3.16:   See  Newey  and  West  (1985,  Theorem  2). 

Proof of Theorem 3.17:

Step 1. $n^{1/2}a_n^{ls}=(n^{-1}f_c(\theta_p^0)^Tf_c(\theta_p^0))^{-1}\,n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)+o_p(1).$

We will show that Step 1 follows from

$$n^{1/2}a_n^{ls}=(n^{-1}f_c(\theta_n^{ml})^Tf_c(\theta_n^{ml}))^{-1}\,n^{-1/2}f_c(\theta_n^{ml})^Tf(\theta_n^{ml}). \tag{5}$$

Given 3.17(i), the mean-value theorem for random functions (Jennrich (1969, Lemma 3)) allows us to write

$$f(\theta_n^{ml})=f(\theta_p^0)+(\partial f(\bar\theta_n)/\partial\theta_p)(\theta_n^{ml}-\theta_p^0), \text{ and} \tag{6}$$

$$f_c(\theta_n^{ml})_i=f_c(\theta_p^0)_i+(\partial f_c(\tilde\theta_n)_i/\partial\theta_p)(\theta_n^{ml}-\theta_p^0),\quad i=1,\ldots,k, \tag{7}$$

where $f_c(\theta_n^{ml})_i$ denotes the i-th column of the matrix $f_c(\theta_n^{ml})$, and $\bar\theta_n$ and $\tilde\theta_n$ each lie on the segment connecting $\theta_n^{ml}$ and $\theta_p^0$.

Given (7), 3.17(ii), and $\operatorname{plim}\theta_n^{ml}=\theta_p^0$, we have

$$n^{-1}f_c(\theta_n^{ml})_i^Tf_c(\theta_n^{ml})_j=n^{-1}f_c(\theta_p^0)_i^Tf_c(\theta_p^0)_j+o_p(1). \tag{8}$$

Given (6), (7), 3.17(ii), 3.17(iii), and $\operatorname{plim}\theta_n^{ml}=\theta_p^0$, we have

$$n^{-1/2}f_c(\theta_n^{ml})_i^Tf(\theta_n^{ml})=n^{-1/2}f_c(\theta_p^0)_i^Tf(\theta_p^0)+o_p(1). \tag{9}$$

Substituting (8) and (9) into (5) we get the desired result:

$$n^{1/2}a_n^{ls}=(n^{-1}f_c(\theta_p^0)^Tf_c(\theta_p^0)+o_p(1))^{-1}\,n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)+o_p(1).$$

Step 2. $n\,(a_n^{ls})^TD_n(\theta_n^{ml})^{-1}a_n^{ls}\overset{A}{\sim}\chi^2_k.$

By Step 1 we can write

$$n^{1/2}a_n^{ls}-A_n(\theta_p^0)^{-1}n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)=\big[(n^{-1}f_c(\theta_p^0)^Tf_c(\theta_p^0))^{-1}-A_n(\theta_p^0)^{-1}\big]\,n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)+o_p(1).$$

Therefore, by 3.17(iv), 3.17(v), and 3.17(vi), we have

$$\operatorname{plim}\big(D_n(\theta_p^0)^{-1/2}n^{1/2}a_n^{ls}-D_n(\theta_p^0)^{-1/2}A_n(\theta_p^0)^{-1}n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)\big)=0. \tag{10}$$

Given 3.17(vi), by Corollary 4.24 of White (1984),

$$D_n(\theta_p^0)^{-1/2}A_n(\theta_p^0)^{-1}n^{-1/2}f_c(\theta_p^0)^Tf(\theta_p^0)\overset{A}{\sim}N(0,I_k). \tag{11}$$

(10) and (11) imply

$$D_n(\theta_p^0)^{-1/2}n^{1/2}a_n^{ls}\overset{A}{\sim}N(0,I_k),$$

and therefore by Corollary 4.28 of White (1984) we have

$$n\,(a_n^{ls})^TD_n(\theta_p^0)^{-1}a_n^{ls}\overset{A}{\sim}\chi^2_k.$$

Finally, since $\operatorname{plim}(D_n(\theta_n^{ml})^{-1}-D_n(\theta_p^0)^{-1})=0$, by Theorem 4.30 of White (1984)

$$n\,(a_n^{ls})^TD_n(\theta_n^{ml})^{-1}a_n^{ls}\overset{A}{\sim}\chi^2_k.$$

Q.E.D.
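The two steps can be illustrated computationally. The sketch below assumes that the least squares coefficient in (5) is the ordinary regression of the vector f on the columns of the matrix f_c, both evaluated at the ML estimate, and that a consistent estimate of the matrix D is supplied by the user. The function names and the small linear solver are illustrative and are not taken from the dissertation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ls_coefficients(f_c, f):
    """a = (f_c' f_c)^{-1} f_c' f, with f_c given as a list of rows."""
    k = len(f_c[0])
    gram = [[sum(row[i] * row[j] for row in f_c) for j in range(k)] for i in range(k)]
    cross = [sum(row[i] * y for row, y in zip(f_c, f)) for i in range(k)]
    return solve(gram, cross)


def quadratic_form_statistic(a, d, n):
    """n * a' D^{-1} a, to be compared with a chi-square(k) critical value."""
    d_inv_a = solve(d, a)
    return n * sum(ai * yi for ai, yi in zip(a, d_inv_a))
```

Solving the linear systems directly avoids forming explicit inverses. The resulting statistic would be referred to the chi-square distribution with k degrees of freedom, as in Step 2.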

A.4  Quadratic Hill-Climbing and the Asymptotic Distribution of the
(p+1)th-Round Estimates

The (p+1)th (p=1,2,...) iteration of the quadratic hill-climbing technique is given by

$$\theta_n^{p+1}=\theta_n^p-(A_n(\theta_n^p)-a_nI)^{-1}\nabla L_n(\theta_n^p) \tag{1}$$

where $a_n=\max(\lambda_n+r\|\nabla L_n(\theta_n^p)\|,0)$, $\lambda_n$ is the maximum eigenvalue of $A_n(\theta_n^p)$, $r$ is a scalar correction factor, and $\|\nabla L_n(\theta_n^p)\|$ denotes the length of the k dimensional vector $\nabla L_n(\theta_n^p)$.

Goldfeld, Quandt, and Trotter (1966) show that the technique chooses $\theta_n^{p+1}$ to maximize the quadratic approximation of $L_n(\theta)$ on a region centered at $\theta_n^p$ of radius

$$\|(A_n(\theta_n^p)-a_nI)^{-1}\nabla L_n(\theta_n^p)\|\le 1/r.$$

If the quadratic approximation is good (that is, if the step increases $L_n(\theta)$), then in the next step $r$ is decreased. Otherwise $r$ is increased. Further details can be found in Goldfeld, Quandt, and Trotter (1966).
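A single iteration of (1) can be sketched as follows for a two-parameter problem, where the largest eigenvalue of the symmetric Hessian is available in closed form. This is a minimal illustration under stated assumptions: the function name, the restriction to two parameters, and the requirement that r > 0 and the gradient be nonzero are choices made here. The rule for updating r between iterations is not implemented.

```python
import math


def hill_climbing_step(theta, grad, hess, r):
    """One quadratic hill-climbing update for a two-parameter problem.

    theta, grad: length-2 lists; hess: symmetric 2x2 Hessian of the objective;
    r: scalar correction factor (r > 0). Returns the new point and the shift a.
    """
    (h11, h12), (_, h22) = hess
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    mean = 0.5 * (h11 + h22)
    lam = mean + math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)
    grad_norm = math.hypot(grad[0], grad[1])
    a = max(lam + r * grad_norm, 0.0)
    # Shifted Hessian (hess - a*I); negative definite when r > 0 and grad != 0.
    s11, s22 = h11 - a, h22 - a
    det = s11 * s22 - h12 * h12
    # step = (hess - a*I)^{-1} grad, using the explicit 2x2 inverse.
    step = [(s22 * grad[0] - h12 * grad[1]) / det,
            (-h12 * grad[0] + s11 * grad[1]) / det]
    return [theta[0] - step[0], theta[1] - step[1]], a
```

When the Hessian is sufficiently negative definite the shift is zero and the update is an ordinary Newton-Raphson step. When the Hessian has a positive eigenvalue the shift is positive, the step still points uphill, and its length does not exceed 1/r.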

Next we show that the estimator defined by $\theta_n^{p+1}$ has the same asymptotic distribution as the partial-MLE provided that $\operatorname{plim}\theta_n^p=\theta^0$ and $\sqrt{n}(\theta_n^p-\theta^0)$ has a proper limiting distribution. More explicitly, we show

$$\sqrt{n}(\theta_n^{p+1}-\theta^0)=-(n^{-1}\nabla^2L_n(\theta^0))^{-1}\,n^{-1/2}\nabla L_n(\theta^0)+o_p(1). \tag{2}$$

The implication is that when consistent initial estimates are employed, iteration beyond the second round does not improve the final estimates, at least asymptotically.

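To prove (2), it suffices to show that $\operatorname{plim} n^{-1}a_n=0$. To see this, consider the mean-value expansion

$$\nabla L_n(\theta_n^p)=\nabla L_n(\theta^0)+A_n(\bar\theta_n)(\theta_n^p-\theta^0). \tag{3}$$

Substituting (3) into (1) and rearranging, we get

$$\sqrt{n}(\theta_n^{p+1}-\theta^0)-(n^{-1}a_nI-n^{-1}\nabla^2L_n(\theta_n^p))^{-1}n^{-1/2}\nabla L_n(\theta^0)=\big[I-(n^{-1}\nabla^2L_n(\theta_n^p)-n^{-1}a_nI)^{-1}n^{-1}\nabla^2L_n(\bar\theta_n)\big]\sqrt{n}(\theta_n^p-\theta^0). \tag{4}$$

Therefore, if $\operatorname{plim} n^{-1}a_n=0$, then (2) follows from (4) since the right hand side of (4) converges in probability to zero.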
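The rearrangement that leads to (4) can be spelled out step by step. The sketch below writes $H_p$ for the shifted matrix $A_n(\theta_n^p)-a_nI$, a shorthand introduced here, and reads $A_n$ as the Hessian $\nabla^2L_n$, an identification suggested by the mean-value expansion (3) rather than stated in this appendix.

```latex
\begin{align*}
\theta_n^{p+1}-\theta^0
  &= (\theta_n^p-\theta^0) - H_p^{-1}\,\nabla L_n(\theta_n^p)
     && \text{by (1)} \\
  &= (\theta_n^p-\theta^0) - H_p^{-1}\bigl[\nabla L_n(\theta^0)
     + A_n(\bar\theta_n)(\theta_n^p-\theta^0)\bigr]
     && \text{by (3)} \\
  &= -H_p^{-1}\,\nabla L_n(\theta^0)
     + \bigl[I - H_p^{-1}A_n(\bar\theta_n)\bigr](\theta_n^p-\theta^0). \\
\intertext{Multiplying by $\sqrt{n}$ and using $H_p^{-1}=n^{-1}(n^{-1}H_p)^{-1}$,}
\sqrt{n}(\theta_n^{p+1}-\theta^0) + (n^{-1}H_p)^{-1}\,n^{-1/2}\nabla L_n(\theta^0)
  &= \bigl[I - (n^{-1}H_p)^{-1}\,n^{-1}A_n(\bar\theta_n)\bigr]
     \sqrt{n}(\theta_n^p-\theta^0),
\end{align*}
```

which is (4) once $-(n^{-1}H_p)^{-1}$ is written as $(n^{-1}a_nI-n^{-1}A_n(\theta_n^p))^{-1}$.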

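The following theorem establishes that $\operatorname{plim} n^{-1}a_n=0$.


Theorem A.4.1. For $\nabla L_n(\theta)=\sum_{t=1}^{n}\partial\log f_t(\theta)/\partial\theta$, suppose

(i)   $\operatorname{plim}\sup_{\theta}\Big|n^{-1}\sum_{t=1}^{n}\big[\partial\log f_t(\theta)/\partial\theta-E(\partial\log f_t(\theta)/\partial\theta)\big]\Big|=0.$

(ii)  $\operatorname{plim}\theta_n^p=\theta^0.$

(iii) $E(\partial\log f_t(\theta)/\partial\theta)$ is continuous.

(iv)  $\operatorname{plim} n^{-1}\lambda_n<0.$

Then $\operatorname{plim} n^{-1}a_n=0$.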

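Proof:

It suffices to show that $\operatorname{plim} n^{-1}\|\nabla L_n(\theta_n^p)\|=0$. Now

$$\begin{aligned}
n^{-1}\|\nabla L_n(\theta_n^p)\| &= n^{-1}\Big(\sum_{i=1}^{k}\nabla L_n(\theta_n^p)_i^2\Big)^{1/2} \\
&= n^{-1}\Big(\sum_{i=1}^{k}\Big(\sum_{t=1}^{n}\partial\log f_t(\theta_n^p)/\partial\theta_i\Big)^2\Big)^{1/2} \\
&\le \sum_{i=1}^{k}\Big|n^{-1}\sum_{t=1}^{n}\partial\log f_t(\theta_n^p)/\partial\theta_i\Big| \xrightarrow{\ p\ } 0
\end{aligned}$$

by (i), (ii) and (iii).
Q.E.D.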
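The key quantity in the proof is the scaled gradient norm. The sketch below illustrates its behaviour in a deliberately simple setting that is not the disequilibrium model: independent N(theta0, 1) observations, for which the score of observation t is y_t - theta, with the sample median standing in for a consistent initial estimate. The model, the seed and the function name are illustrative assumptions.

```python
import random
import statistics


def scaled_score_norm(n, theta0=0.0, seed=0):
    """n^{-1} |sum_t (y_t - theta_n)| with theta_n the sample median."""
    rng = random.Random(seed)
    y = [rng.gauss(theta0, 1.0) for _ in range(n)]
    theta_n = statistics.median(y)  # consistent, but not the MLE
    return abs(sum(yt - theta_n for yt in y)) / n


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, scaled_score_norm(n))
```

In this toy model the scaled norm equals the absolute difference between the sample mean and the sample median, which shrinks toward zero as n grows.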

In effect, the proof of (2) follows from the observation that if $\operatorname{plim} n^{-1}a_n=0$, then for sufficiently large n equation (1) reduces to the Newton-Raphson technique with probability approaching one. Given that the proof depends on (1) reducing to the Newton-Raphson technique asymptotically, why not use the latter to begin with? Unfortunately, a definitive answer to this question is not available. The answer lies in the small sample properties of the estimators, which undoubtedly would require Monte Carlo studies to help uncover. We have chosen quadratic hill-climbing over Newton-Raphson because it is somewhat reassuring to know that the former always moves in the direction of a maximizer of the likelihood function, while the latter might not.


BIBLIOGRAPHY 


Amemiya,  T.  (1973).   "Regression  Analysis  when  the  Dependent  Variable 
is  Truncated  Normal."  Econometrica  41:997-1016. 

Amemiya,  T.  (1974).   "A  Note  on  the  Fair  and  Jaffee  Model." 
Econometrica  42:759-762. 

Amemiya, T., and G. Sen (1977). "The Consistency of the Maximum
Likelihood Estimator in a Disequilibrium Model." Technical Report
238. Institute for Mathematical Studies in the Social Sciences,
Stanford University.

Benassy, J.P. (1982). The Economics of Market Disequilibrium. New
York: Academic Press.

Bowden,  R.J.  (1978).   The  Econometrics  of  Disequilibrium.  Amsterdam: 
North  Holland. 

Cosslett,  S.R.  (1983).   "Distribution-free  Maximum  Likelihood  Estimator 
of  the  Binary  Choice  Model."  Econometrica  51:765-782. 

Fair,  R.C.,  and  D.M.  Jaffee  (1972).   "Methods  of  Estimation  for  Markets 
in  Disequilibrium."  Econometrica  40:497-514. 

Fair, R.C., and H.H. Kelejian (1974). "Methods of Estimation for
Markets in Disequilibrium: A Further Study." Econometrica
42:177-190.

Fisher,  F.M.  (1983).   Disequilibrium  Foundations  of  Equilibrium 
Economics.   New  York:   Cambridge  University  Press. 

Goldfeld, S.M., and R.E. Quandt (1975). "Estimation in a
Disequilibrium Model and the Value of Information." Journal of
Econometrics 3:325-348.

Goldfeld, S.M., R.E. Quandt, and H.F. Trotter (1966). "Maximization
by Quadratic Hill-climbing." Econometrica 34:541-551.

Gordin,  M.I.  (1969).   "The  Central  Limit  Theorem  for  Stationary 
Processes."  Soviet  Mathematics  10:1174-1176. 

Hartley, M.J., and P. Mallela (1977). "The Asymptotic Properties of a
Maximum Likelihood Estimator for a Model of Markets in
Disequilibrium." Econometrica 46:1251-1271.




Heckman,  J.J.  (1976).   "The  Common  Structure  of  Statistical  Models  of 
Truncated,  Sample  Selection  and  Limited  Dependent  Variables  and  a 
Simple  Estimator  for  Such  Models."  Annals  of  Economic  and  Social 
Measurement  5:475-492. 

Hoadley, B. (1971). "Asymptotic Properties of Maximum Likelihood
Estimators for the Independent Not Identically Distributed Case."
Annals of Mathematical Statistics 42:1977-1991.

Ito, T., and K. Ueda (1981). "Tests of the Equilibrium Hypothesis in
Disequilibrium Econometrics: An International Comparison of Credit
Rationing." International Economic Review 22:691-708.

Jennrich, R.I. (1969). "Asymptotic Properties of Non-linear Least
Squares Estimators." Annals of Mathematical Statistics 40:633-643.

Laffont,  J.J.  and  R.  Garcia  (1977).   "Disequilibrium  Econometrics  for 
Business  Loans."  Econometrica  45:1187-1204. 

Lee,  L.F.,  and  R.H.  Porter  (1984).   "Switching  Regression  Models  with 
Imperfect  Sample  Separation  Information  —  With  an  Application  on 
Cartel  Stability."  Econometrica  52:391-418. 

Levine,  D.  (1983).   "A  Remark  on  Serial  Correlation  in  Maximum 
Likelihood."  Journal  of  Econometrics  23:337-342. 

Loeve,  M.  (1960).   Probability  Theory.   2nd  ed.  Princeton:   Van 
Nostrand. 

Maddala, G.S. (1983). Limited-dependent and Qualitative Variables in
Econometrics. New York: Cambridge University Press.

Maddala,  G.S.,  and  F.  Nelson  (1974).   "Maximum  Likelihood  Methods  for 
Markets  in  Disequilibrium."  Econometrica  42:1013-1030. 

Manski,  C.F.  (1975).   "The  Maximum  Score  Estimation  of  the  Stochastic 
Utility  Model  of  Choice."  Journal  of  Econometrics  3:205-228. 

Manski,  C.F.  (1983).   "Closest  Empirical  Distribution  Estimator." 
Econometrica  51:305-320. 

Manski, C.F. (1985). "Semiparametric Analysis of Discrete Response:
Asymptotic Properties of the Maximum Score Estimator." Journal of
Econometrics 27:313-333.

McLeish,  D.C.  (1975).   "A  Maximal  Inequality  and  Dependent  Strong  Laws." 
Annals  of  Probability  3:826-836. 

Newey,  W.K.  and  K.D.  West  (1985).   "A  Simple,  Positive  Definite, 
Heteroscedasticity  and  Autocorrelation  Consistent  Covariance 
Matrix."  Discussion  paper  92,  Woodrow  Wilson  School,  Princeton 
University. 


Olsen, R.J. (1978). "Note on the Uniqueness of the Maximum Likelihood
Estimator for the Tobit Model." Econometrica 46:1211-1215.

Powell,  J.L.  (1984).   "Least  Absolute  Deviations  Estimation  for  the 
Censored  Regression  Model."  Journal  of  Econometrics  25:303-325. 

Rao,  R.R.  (1962).   "Relations  between  Weak  and  Uniform  Convergence  of 
Measures  with  Applications."  Annals  of  Mathematical  Statistics 
33:659-680. 

Rudin,  W.  (1976).   Principles  of  Mathematical  Analysis.   New  York: 
McGraw-Hill. 

Serfling,  R.J.  (1968).   "Contributions  to  Central  Limit  Theory  for 
Dependent  Variables."  Annals  of  Mathematical  Statistics 
39:1158-1175. 

Sealy, C.W., Jr. (1979). "Credit Rationing in the Commercial Loan
Market: Estimates of a Structural Model Under Conditions of
Disequilibrium." Journal of Finance 34:689-702.

Wald,  A.  (1949).   "Note  on  the  Consistency  of  the  Maximum  Likelihood 
Estimate."  Annals  of  Mathematical  Statistics  20:595-601. 

White,  H.  (1980).   "Nonlinear  Regression  on  Cross-Section  Data." 
Econometrica  48:721-746. 

White,  H.  (1984).  Asymptotic  Theory  for  Econometricians.   New  York: 
Academic  Press. 

White,  H.,  and  I.  Domowitz  (1981).   "Nonlinear  Regression  with  Dependent 
Observations."  Unpublished  paper,  University  of  California,  San 
Diego. 

White,  H.,  and  I.  Domowitz  (1984).   "Nonlinear  Regression  with  Dependent 
Observations."  Econometrica  52:143-162. 


BIOGRAPHICAL  SKETCH 

Walter  James  Mayer  was  born  in  Detroit,  Michigan,  in  1955.   He 
received  a  Bachelor  of  Arts  degree  in  economics  from  the  University  of 
Missouri  in  1982,  and  a  Master  of  Arts  degree  from  the  University  of 
Florida  in  1983. 




I  certify  that  I  have  read  this  study  and  that  in  my  opinion  it 
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a  dissertation  for  the  degree  of 
Doctor  of  Philosophy. 




Stephen  R.  Cosslett,  Chairman 
Associate  Professor  of  Economics 


I  certify  that  I  have  read  this  study  and  that  in  my  opinion  it 
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a  dissertation  for  the  degree  of 
Doctor of Philosophy.


G.S.  Maddala 
Professor  of  Economics 


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.


A.I. Khuri
Associate Professor of Statistics


This dissertation was submitted to the Graduate Faculty of the Department
of Economics in the College of Business Administration and to the Graduate
School and was accepted as partial fulfillment of the requirements for
the degree of Doctor of Philosophy.


December,  1986 

Dean,  Graduate  School 


UNIVERSITY  OF  FLORIDA 




